ABSTRACT

We measure the topology (genus curve) of the galaxy distribution in a very large
cosmological simulation designed to resemble closely results from the upcoming Sloan
Digital Sky Survey (SDSS). The mock survey is based on a large N-body simulation
that uses 54,872,000 particles in a periodic cube $600\,h^{-1}$ Mpc on a side. The adopted
inflationary cold dark matter (CDM) model has parameters $\Omega_{\rm CDM} = 0.4$, $\Omega_\Lambda = 0.6$,
$h = 0.6$, $b = 1.3$. We "observe" this simulation to produce a simulated redshift catalog
of $\sim 10^6$ galaxies over $\pi$ steradians, mimicking the anticipated spectroscopic selection
procedures of the SDSS in some detail. Sky maps, redshift slices, and 3-D contour
maps of the mock survey reveal a rich and complex structure, including networks of
voids and superclusters that resemble the patterns seen in the CfA redshift survey and
the Las Campanas Redshift Survey (LCRS). The 3-D genus curve can be measured
from the simulated catalog with superb precision; this curve not only has the general
shape predicted for Gaussian, random phase initial conditions, but the error bars are
small enough to demonstrate with high significance the subtle departures from this
shape caused by non-linear gravitational evolution on a $10\,h^{-1}$ Mpc smoothing scale
(where $\sigma_{\rm gal} = 0.4$). These distortions have the form predicted by Matsubara's (1994)
perturbative analysis, but they are smaller in amplitude. We also measure the 3-D 
genus curve of the radial peculiar velocity field measured by applying distance-indicator 
relations (with realistic errors) to the mock catalog. The genus curve is consistent with 
the Gaussian random phase prediction, though it is of relatively low precision because 
of the large smoothing length required to overcome noise in the measured velocity field. 
Finally, we measure the 2-D topology in redshift slices, similar to those which will 
become available early in the course of the SDSS and to those already observed in the 
LCRS. The genus curves of these slices are consistent with the observed genus curves 
of the LCRS, suggesting that our inflationary CDM model with $\Omega_{\rm CDM} \sim 0.4$ is a good
choice. Our mock redshift catalog is publicly available for the use of other researchers. 

1. Introduction

The nature of large-scale structure in the universe is one of the pre-eminent questions of 
modern astronomy. Gravitational growth of structure from quantum fluctuations during inflation 
has become the leading candidate for production of this structure. The topology of large-scale 
structure as quantified by the "genus curve" (Gott, Melott, & Dickinson 1986, hereafter GMD; 
Hamilton, Gott, & Weinberg 1986; Gott, Weinberg, & Melott 1987, hereafter GWM) tests one of 
the most robust predictions of inflation, namely that the quantum fluctuations produce a Gaussian 
random phase field of density perturbations, the seeds of the structures seen today. To date, all 
topological studies of galaxy redshift surveys (in 3-D and 2-D) and cosmic microwave background 
(CMB) anisotropies (in 2-D) have been consistent with Gaussian random phase initial conditions 
(Gott et al. 1989; Moore et al. 1992; Park, Gott, & da Costa 1992; Gott, Rhoads, & Postman 
1994; Vogeley et al. 1994; Colley, Gott, & Park 1996; Kogut et al. 1996; Colley 1997; Protogeros 
& Weinberg 1997; Canavezes et al. 1998; Springel et al. 1998). 

The genus test becomes only more powerful as cosmological surveys become larger. While 
several prodigious observational efforts have been made to survey the large-scale structure traced 
by optical galaxies (Geller & Huchra 1989; Shectman et al. 1996) and by IRAS galaxies (Canavezes
et al. 1998 and references therein), the 2-degree Field (2dF) redshift survey (see Colless 1998) and
the Sloan Digital Sky Survey (SDSS) will dwarf all currently existing redshift surveys in data
volume. The SDSS will obtain approximately one million galaxy redshifts (compared to ~250,000
anticipated for 2dF), and the resulting high-precision measurement of the three-dimensional 
topology will therefore provide a far more powerful test of the random phase hypothesis than is 
possible today. In order to prepare for topological analysis and other statistical studies of this 
enormous sample, we have created a simulation of the SDSS using a very large N-body simulation 
($N = 380^3 = 54{,}872{,}000$ particles). In this paper we show that the three-dimensional genus curve can
be measured with unprecedented precision from the SDSS, as anticipated. We also show that it 
should be possible to measure the topology of the smoothed radial velocity field inferred from the 
SDSS images and spectra via distance-indicator relations. 

Early on in the SDSS, two-dimensional redshift slices, comparable to the deepest existing 
wide-angle redshift surveys, such as the Las Campanas Redshift Survey (LCRS, Shectman et 
al. 1996), will allow studies of the two-dimensional genus curve. Colley (1997) computed the 
two-dimensional topology of large-scale structure observed in the LCRS. We have therefore 
generated slices from our Simulated SDSS (SSDSS), intended to match closely those of the LCRS. 
We use particular statistical care to compare the topology of our simulation slices to that of the 
observed slices from the LCRS. Redshift slices (of constant angle in the sky) are complementary 
to maps of the CMB, which mainly probe a shell at constant radius ($z \sim 1000$). The topology of
CMB fluctuations as observed by COBE (Smoot et al. 1994) has been computed by Colley et al.
(1996) and Kogut et al. (1996). A similar 2-D topological approach has been applied to projected
galaxy and cluster catalogs by Gott, Mao, Park, & Lahav (1992), Plionis, Valdarnini, & Coles
(1992), Coles et al. (1993), and Davies & Coles (1993).



2. The Cosmological Simulation 

The SDSS will measure the redshifts of approximately $10^6$ galaxies in $\pi$ steradians of the
northern sky. This observational sample should provide an unprecedented opportunity to measure 
quantitatively the 3-D and 2-D topology of large-scale structure in the universe. 

Since, at its best, theory should predict observations before they are taken, we felt it 
important to simulate ahead of time the results to be obtained by the SDSS. One purpose of this 
is cosmological model testing, of course, but no less importantly we hope to show in a general way 
what features may be expected from a gravitational instability model. Perhaps most important of 
all, we hope to illustrate just how powerful the SDSS will be, i.e., just how accurately it will allow 
us to measure statistical quantities of interest, such as the genus curve. To this end, we have run 
a very large N-body simulation intended to mimic results from the SDSS.

In 1992 the results from COBE (Smoot et al. 1992; Wright et al. 1992) provided great support 
for the gravitational instability picture by finally showing the long sought fluctuations in the 
microwave background predicted by this theory. The fluctuations found by COBE were consistent 
with a Harrison (1970)-Peebles & Yu (1970)-Zeldovich (1972) spectrum of fluctuations with a 
primordial index n = 1, Gaussian random phases (e.g. Colley et al. 1996; Kogut et al. 1996), and 
a standard, unbiased ($b = 1$), $\Omega = 1$, $h = 0.5$ (where $h = H_0$/[100 km/s/Mpc]), inflationary cold
dark matter (CDM) universe. It is noteworthy that the fluctuations seen by COBE are on scales 
larger than the horizon size at recombination, just the sort of fluctuations one would expect from 
random quantum fluctuations in a standard inflationary model (Guth 1981). 

While CDM inflationary models allow low amplitude microwave background fluctuations like 
those observed by COBE, they simultaneously produce structure which matches that observed at 
redshifts $z \lesssim 5$, because the smoothly distributed baryons can fall into pre-existing CDM potential
wells after recombination and because the hierarchical form of the inflationary CDM power 
spectrum (Bardeen, Steinhardt, & Turner 1983) allows early formation of quasars and galaxies. 
Furthermore, adiabatic Gaussian fluctuations with this initial n = 1 power spectrum have proven 
amazingly successful at explaining the qualitative features of observed galaxy clustering, including 
the characteristic pattern of great walls and giant voids seen in the Geller & Huchra (1989) slices 
(White et al. 1987; Park 1990; Weinberg & Gunn 1990a) and lattice-like sequences of great walls in 
deep pencil beam surveys (Broadhurst et al. 1990; Park & Gott 1991a). Other successes of CDM 
include a wide variety of properties in common with observation: deep sky
maps show clusters, but are bland compared to 6° slices, which show large voids; 20° wide slices
show great walls; velocity fields show great attractors; the 3-D topology is spongelike; deep pencil 
beams show surprisingly regular lattices of great walls out to z = 0.5; deep slices to z = 0.2 show 
a more uniform appearance than shallow slices to z = 0.05; the cluster-cluster covariance function 
is a higher amplitude version of the galaxy-galaxy covariance function (Bahcall & Soneira 1983; 
Klypin & Kopylov 1983); and the three-point correlation functions of clusters and galaxies show 
the same triangular form (Gott, Gao, & Park 1991). Even if one were given complete freedom 
to construct an arbitrary, multi-parameter, geometrical model of large-scale structure, it is hard 
to see how one could come up with something that would reproduce all of these qualitative and 
quantitative successes of the physically motivated, dynamical model of gravitational structure 
formation from Gaussian initial conditions with an inflationary CDM power spectrum. 

Within the inflationary CDM paradigm, there are many reasons for selecting a model with 
$\Omega \sim 0.4$. In models with scale-invariant inflationary fluctuations ($n = 1$), CDM-dominated matter
content ($\Omega_{\rm HDM} \ll \Omega_{\rm CDM}$, $\Omega_B \ll \Omega_{\rm CDM}$), and a standard relativistic background (CMB photons
plus the usual light neutrino species), a low value of $\Omega h$ is required to explain the observed shape of
the galaxy power spectrum (see Peacock & Dodds 1994 and numerous references therein) and the
observed amplitude of the 3-D galaxy genus curve at large smoothing scales (see Moore et al. 1992;
Vogeley et al. 1994; Canavezes et al. 1998). With the same restrictions, a value of $\Omega \sim 0.25$-$0.5$
is required to match simultaneously the COBE fluctuation amplitude and the mass function of
galaxy clusters for $t_0 \gtrsim 13$ Gyr (Cole et al. 1997 and references therein). Direct evidence for
$\Omega \sim 0.4$ includes the combination of the cluster baryon fraction with big bang nucleosynthesis
constraints (Evrard 1997) and the combination of the $z = 2.5$ mass power spectrum inferred
from the Lyman-alpha forest with the $z = 0$ cluster mass function (Weinberg et al. 1998). This
value of $\Omega$ is consistent with some analyses of galaxy cluster evolution (e.g., Bahcall, Fan, & Cen
1997; Eke, Cole, & Frenk 1998) but not with others (e.g., Blanchard & Bartlett 1998; Reichart
et al. 1998; Viana & Liddle 1998). With a modest bias of the galaxy population in clusters it is
consistent with observed cluster mass-to-light ratios (Carlberg et al. 1996, 1997). 

Once one adopts low $\Omega$, the inclusion of a cosmological constant $\Omega_\Lambda = 1 - \Omega$ is theoretically
attractive (Ratra & Peebles 1998; Peebles & Ratra 1998) because it maintains the flat spatial
geometry that seems the most natural prediction of inflation, requiring no fine-tuning of the
number of inflationary e-folds (though fine-tuning is still required to explain the present-day
value of $\Omega_\Lambda$). Such a model could arise naturally within Linde's (1990) chaotic inflation scenario.
Flat geometry also appears to be favored by some preliminary measurements of the angular
location of the first Doppler peak in the CMB multipole spectrum (see, e.g., discussions by
Lineweaver [1998] and Tegmark [1998]). A cosmological constant makes it easy to reconcile values
of $h \sim 0.6$-$0.8$ with the inferred ages of the oldest globular clusters. A value of $\Omega_\Lambda = 0.6$ is
low enough that gravitational lensing cross-sections are not an embarrassment (Fukugita et al.
1992), and indeed Chiba & Yoshii (1999) argue that gravitational lensing statistics favor a value
of $\Omega_\Lambda$ in this range. The recent inference of an accelerating cosmic expansion from observations
of Type Ia supernovae (Riess et al. 1998; Perlmutter et al. 1999) provides additional and more
direct evidence for a cosmological constant or some similar negative-pressure component of the
universe. These and other arguments in favor of a low-density, flat universe have been summarized
by a number of authors, including a recent analysis by Roos & Harun-or-Rashid (1999), who
show that the combination of a variety of independent constraints favors $\Omega_\Lambda = 0.70 \pm 0.14$ and
$\Omega_M + \Omega_\Lambda = 0.99 \pm 0.16$ ($1\sigma$ error bars) even without the supernova data.

An alternative to the $\Lambda$CDM model is an open model with $\Omega_{\rm CDM} < 1$ and $k = -1$. An open
universe can be produced in single-bubble inflationary models (cf. Gott 1982; Gott & Statler 1984;
Gott 1986; Ratra & Peebles 1994, 1995; Bucher et al. 1995a,b; Yamamoto et al. 1995; Linde 1995;
Linde & Mezhlumian 1995; Hawking & Turok 1998). An $\Omega = 0.4$, $\Omega_\Lambda = 0.6$, $b = 1.3$ model is quite
similar, in terms of galaxy clustering, to an $\Omega = 0.4$, $\Omega_\Lambda = 0$, $b = 1.3$ model, so our simulation can
serve reasonably well as a stand-in for either.

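Ratra et al. (private communication) find the following 4-year-COBE normalized values with
$2\sigma$ limits, all assuming $\Omega_B h^2 = 0.0125$:

$$
\begin{array}{l}
\Omega = 1,\ \Omega_\Lambda = 0,\ h = 0.54,\ t_0 = 12\ {\rm Gyr},\ \Omega h = 0.54,\ \sigma_8 = 1.1\text{-}1.5,\ 0.7 < b < 0.9; \\
\Omega = 0.4,\ \Omega_\Lambda = 0.6,\ h = 0.6,\ t_0 = 14.5\ {\rm Gyr},\ \Omega h = 0.24,\ \sigma_8 = 0.82\text{-}1.2,\ 0.8 < b < 1.22; \\
\Omega = 0.4,\ \Omega_\Lambda = 0,\ h = 0.63,\ t_0 = 12\ {\rm Gyr},\ \Omega h = 0.252,\ \sigma_8 = 0.51\text{-}0.74,\ 1.35 < b < 1.96.
\end{array}
\qquad (1)
$$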

Thus our simulation's bias parameter $b = 1.3$ (our simulation was done before the COBE 4-year
normalization was available) is just above the $0.8 < b < 1.22$ range allowed by the COBE 4-year
normalization, so it has only marginally high bias. Its bias is slightly below the range allowed by the
COBE 4-year normalization for an $\Omega = 0.4$, $\Omega_\Lambda = 0$ model ($1.35 < b < 1.96$). This modest level of
bias is consistent with the predictions of
hydrodynamic cosmological simulations (e.g., Cen & Ostriker 1992; Katz, Hernquist, & Weinberg
1992) and semi-analytic models of galaxy formation (e.g., Kauffmann, Nusser, & Steinmetz 1997).

We thus adopt a CDM inflationary model with cosmological parameter values $\Omega = 0.4$,
$\Omega_\Lambda = 0.6$, $k = 0$, $h = 0.6$, $t_0 = 14.5$ Gyr, and r.m.s. fractional mass fluctuation $\sigma_8 = 1/b = 0.77$ in
spheres of radius $8\,h^{-1}$ Mpc. These values arguably provide the best fit to all current observational
constraints. (Ostriker & Steinhardt [1995] independently picked as their favored model one
with similar parameters; Turner [1998] also independently arrived at these as a promising set of
parameters, roughly consistent with recent Type Ia supernovae data from Riess et al. [1998] and
Perlmutter et al. [1999], which favor $\Omega \sim 0.2$-$0.3$.)

Our N-body simulation uses $N = 380^3 = 54{,}872{,}000$ particles (the simulation was run
in 1993; some results from it were published in Vogeley et al. [1994], prior to the appearance
of this paper). The simulation uses a staggered-grid, particle-mesh code (Park 1990) with a
$600^3$ density-potential mesh and a simulation volume of $(600\,h^{-1}\,{\rm Mpc})^3$. Thus our gravitational
force resolution is $\sim 1\,h^{-1}$ Mpc. A biased subset of 8,292,455 particles is chosen to represent
galaxies, selected as peaks that lie above a threshold $\delta/\delta_{\rm rms} = 0.8$ when the CDM density field
is smoothed over $0.71\,h^{-1}$ Mpc. The peak particles are identified using the Bardeen et al. (1986)
peak-background split approximation, as discussed by Park (1991). This combination of smoothing
scale and peak threshold yields a bias factor $b = 1.3$ between the r.m.s. fluctuations of galaxy
particles and dark matter on large scales.



3. The Mock Redshift Catalog 

We have attempted to model the anticipated selection properties of the SDSS redshift survey 
in some detail, in part because our simulated redshift survey is being used for a number of internal 
tests of the survey data analysis software and observing strategy. The present observational plan 
is to select galaxies in the main redshift sample based on their Petrosian (1976) magnitudes and 
half-light surface brightnesses in the r' band (see Fukugita et al. [1996] and Gunn et al. [1998] 
for discussions of the SDSS photometric system). Definitions of our Petrosian magnitude system 
and motivation for its use are given by Gunn & Weinberg (1995). In the notation of that paper 
we adopt parameters $f_1 = 1/8$ and $f_2 = 2$ for our mock redshift catalog, which means that we
define magnitudes within a circular aperture of radius $2R_P$, where the Petrosian radius $R_P$ is the
radius at which the local surface brightness falls to 1/8 of the mean interior surface brightness.
As a result of more recent tests, the SDSS is likely to adopt somewhat different values of $f_1$ and
$f_2$, but we do not expect these changes to substantially alter the survey selection function, since
the magnitude limit will be adjusted to ensure the desired number of galaxy targets over the
$\pi$-steradian survey area.

In order to compute the quantities used for target selection, we first assign two fundamental 
parameters to each of the 8,292,455 galaxy particles: a $B_T$ luminosity and a Hubble type.
The $B_T$ luminosities are randomly drawn from a Schechter (1976) luminosity function with
parameters $M_* = -19.68 + 5\log h$, $\alpha = -1.07$, truncated below $0.064\,L_*$ (about 3 magnitudes
below $M_*$) so that the space density of galaxies above the cutoff matches the space density
$n = 0.038\,h^3\,{\rm Mpc}^{-3}$ of galaxies in the N-body simulation. Hubble types E-Sc are assigned
randomly, with probabilities modified according to local galaxy density in accord with the 
Postman & Geller (1984) morphology-density relation (see Narayanan, Berlind, & Weinberg 
[1998] for details). Bulge-to-disk ratios are assigned as a function of Hubble type based on Kent 
(1985), and bulge axis ratios are drawn from the distribution inferred by Ryden (1992). Spheroid 
half-light radii are assigned using Maoz & Rix's (1992) parametrization of the fundamental plane 
(Djorgovski & Davis 1985; Dressler et al. 1987) for ellipticals, modified according to Fig. 5(b) of
Kent (1985) for spiral bulges. Disk half-light radii are assigned using Freeman's (1970) value for
the typical central surface brightness. We add Gaussian scatter with r.m.s. of 0.15 in $\log_{10} D$ to
the diameters $D$ in order to ensure a population of high and low surface brightness galaxies; this
scatter may be unrealistically large, at least for the bulge components. We compute K-corrected 
magnitudes in the SDSS filters using the spectral-energy distributions of Coleman, Wu, & 
Weedman (1980). Finally, we compute Petrosian magnitudes and half-light surface brightnesses 
assuming a de Vaucouleurs (1948) profile for the bulge components and an inclined exponential 
profile for the disk components. 
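
The luminosity assignment described above can be sketched in a few lines. This is a minimal illustration, not the procedure actually used for the mock catalog: it draws $L/L_*$ from a Schechter function with the quoted $\alpha = -1.07$ and lower cutoff $0.064\,L_*$ by inverting a tabulated cumulative distribution. The upper cutoff `X_MAX`, the grid size, and all function names are assumptions introduced here for the example.

```python
import bisect
import math
import random

ALPHA = -1.07   # faint-end slope quoted in the text
X_MIN = 0.064   # lower luminosity cutoff in units of L_*
X_MAX = 20.0    # hypothetical upper cutoff; the exponential tail beyond it is negligible


def schechter(x, alpha=ALPHA):
    """Unnormalized Schechter function in x = L / L_*."""
    return x ** alpha * math.exp(-x)


def build_cdf(n=4000):
    """Tabulate the cumulative distribution on a log-spaced grid (trapezoid rule)."""
    log_lo, log_hi = math.log(X_MIN), math.log(X_MAX)
    xs = [math.exp(log_lo + (log_hi - log_lo) * i / (n - 1)) for i in range(n)]
    cdf = [0.0]
    for a, b in zip(xs[:-1], xs[1:]):
        cdf.append(cdf[-1] + 0.5 * (schechter(a) + schechter(b)) * (b - a))
    total = cdf[-1]
    return xs, [c / total for c in cdf]


def draw_luminosities(count, rng, table):
    """Draw L / L_* values by inverting the tabulated CDF with linear interpolation."""
    xs, cdf = table
    out = []
    for _ in range(count):
        u = rng.random()
        j = bisect.bisect_right(cdf, u)          # first grid point with cdf > u
        j = min(max(j, 1), len(xs) - 1)
        frac = (u - cdf[j - 1]) / (cdf[j] - cdf[j - 1])
        out.append(xs[j - 1] + frac * (xs[j] - xs[j - 1]))
    return out


table = build_cdf()
sample = draw_luminosities(10000, random.Random(1), table)
faint_fraction = sum(1 for x in sample if x < 1.0) / len(sample)
```

An absolute magnitude follows from each draw as $M = M_* - 2.5\log_{10}(L/L_*)$; with this slope and cutoff, most of the drawn galaxies are fainter than $L_*$.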

We adopt a half-light surface brightness threshold $\mu_{1/2} = 22$ mag arcsec$^{-2}$ and a Petrosian
magnitude limit $r' = 17.9$ (on the AB magnitude system, Oke & Gunn [1983]), yielding 872,377
galaxies in the mock redshift catalog. The SDSS currently plans to target $\sim 10^5$ luminous red
ellipticals in addition to the main galaxy sample, using photometric redshifts to obtain a sparse,
nearly volume-limited sample extending to $z \sim 0.4$. We attempt to model only the $\sim 9 \times 10^5$ main
galaxy sample here because of the size of our simulation cube. 

Once we have decided which galaxies are bright enough to be included in the mock catalog 
(with origin at a corner of the periodic simulation cube), we need to set the survey sky coverage. 
The north Galactic cap portion of the SDSS will target a roughly elliptical region, 130° by 110°. 
One may construct this region by choosing a polar coordinate system $(\theta, \phi)$ centered on the center
of the ellipse. The two foci A and B of the ellipse are along the major axis, each at a distance
$\theta_f = 42.54^\circ = \cos^{-1}[\cos(130^\circ/2)/\cos(110^\circ/2)]$. Let $(\theta_g, \phi_g)$ be the polar coordinates of a galaxy
in the sky. Its angular distances in the sky from foci A and B respectively are:

$$
\begin{array}{l}
\theta_{gA} = \cos^{-1}[\sin\theta_f \sin\theta_g \cos\phi_g + \cos\theta_g \cos\theta_f], \\
\theta_{gB} = \cos^{-1}[-\sin\theta_f \sin\theta_g \cos\phi_g + \cos\theta_g \cos\theta_f].
\end{array}
\qquad (2)
$$

The galaxy is included in the elliptical survey region if

$$\theta_{gA} + \theta_{gB} < 130^\circ. \qquad (3)$$
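
The footprint test above translates directly into code. The following is a minimal sketch under the assumption that the azimuth $\phi_g$ is measured from the major axis, with focus A at $\phi = 0$ and focus B at $\phi = 180^\circ$; the function names are ours and are not taken from any survey software.

```python
import math

MAJOR_AXIS_DEG = 130.0   # full extent of the survey ellipse along the major axis
MINOR_AXIS_DEG = 110.0   # full extent along the minor axis


def focal_distance_deg(major=MAJOR_AXIS_DEG, minor=MINOR_AXIS_DEG):
    """Angular distance of each focus from the center of the spherical ellipse."""
    ratio = math.cos(math.radians(major / 2)) / math.cos(math.radians(minor / 2))
    return math.degrees(math.acos(ratio))


def in_survey(theta_g_deg, phi_g_deg):
    """True if a galaxy at polar angle theta_g and azimuth phi_g lies inside the ellipse."""
    theta_f = math.radians(focal_distance_deg())
    theta_g = math.radians(theta_g_deg)
    phi_g = math.radians(phi_g_deg)
    base = math.cos(theta_g) * math.cos(theta_f)
    cross = math.sin(theta_f) * math.sin(theta_g) * math.cos(phi_g)
    # Clamp against rounding before acos; the two foci differ only in the sign of `cross`.
    dist_a = math.acos(max(-1.0, min(1.0, base + cross)))
    dist_b = math.acos(max(-1.0, min(1.0, base - cross)))
    return math.degrees(dist_a + dist_b) < MAJOR_AXIS_DEG
```

As a sanity check, points on the major axis are accepted out to just under 65° from the center and points on the minor axis out to just under 55°, matching the stated 130° by 110° extent.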



Fig. 1 shows a sky map of the ~900,000 galaxies in the survey region. This figure bears a
remarkable qualitative resemblance to the map of the real sky derived from the Shane-Wirtanen
(1967) counts (reproduced on page 41 of Peebles [1993]), which reach to similar depth (without 
redshifts, of course). 

Fig. 2 shows how a 6° x 130° slice along the major axis of the survey looks when the observed
galaxies are plotted in redshift space. Galaxy redshifts are converted into comoving distances using
the redshift-comoving distance relation of the $\Omega_\Lambda = 0.6$, $\Omega = 0.4$ cosmological model. (A simple
cubic approximation to this relation, $r_c = 3000z - 940z^2 + 130z^3\ h^{-1}$ Mpc, has a maximum error of
$0.46\,h^{-1}$ Mpc out to $z = 0.4$.) The observer is located at the vertex of the fan, which extends to a
redshift corresponding to a comoving distance of $500\,h^{-1}$ Mpc. Because we have a lower luminosity
limit for galaxies (3 magnitudes below $M_*$ in $B$), this magnitude-limited sample is incomplete at
distances below $\sim 130\,h^{-1}$ Mpc; the combination of this limit with the decreasing physical width
of the slice causes the apparent deficit of galaxies near the vertex of the fan. In the region from
$130\,h^{-1}$ Mpc to $400\,h^{-1}$ Mpc, where the density of galaxies is highest and the clustering is most
easily seen, we find a wealth of structure. The many small "fingers of god" pointing toward the
observer show the locations of clusters; each "finger" corresponds to a roughly spherical cluster,
which is stretched in redshift space by the peculiar velocities of its member galaxies. Many voids
can be seen with typical sizes of $30\,h^{-1}$ Mpc to $50\,h^{-1}$ Mpc, quite similar to those observed in the
de Lapparent, Geller, & Huchra (1986) 6° slice, which goes out to $150\,h^{-1}$ Mpc. This pattern of
voids is quite typical of CDM models (cf. Park 1990). Many small and "great" walls (Geller &
Huchra 1989) are visible, frequently up to $150\,h^{-1}$ Mpc in length, corresponding to the "pancake"
structures that Zel'dovich (1970) argued would be a natural consequence of gravitational clustering
from Gaussian initial conditions (see also Shandarin & Zel'dovich 1989). There is even a "great
wall complex" extending for $\sim 400\,h^{-1}$ Mpc across the slice, at distances $\sim 150$-$300\,h^{-1}$ Mpc from
the observer. Such features are also seen in $\Omega = 1$, $h = 0.5$, $b = 2$ CDM models (Park 1990).
The visual difference between this model with $\Omega h = 0.24$ and a standard $\Omega h = 0.5$ CDM model
is in the low amplitude, rolling hills and valleys in the distribution. The extra modulation of the
basic network of voids and walls reflects the extra power on large scales present in an $\Omega h = 0.24$
CDM model. Nonetheless, we can see that the structure in the slice is approaching a qualitatively
"fair sample" of this simulated universe: this fan has a much more uniform appearance than the
de Lapparent et al. (1986) 6° slice, which only went out to $150\,h^{-1}$ Mpc. Any theory based on
Friedmann cosmological models must approach uniformity on large scales. In the $\Omega h = 0.24$ CDM
model, the power spectrum $P(k)$ peaks at wavelength $\lambda \sim 260\,h^{-1}$ Mpc, and it drops at larger
scales, approaching $P(k) \propto k$ at very long wavelengths.
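
The redshift-to-distance conversion used to draw the slice can be checked numerically. The sketch below integrates the standard flat-model relation $r_c = (c/H_0)\int_0^z dz'/\sqrt{\Omega(1+z')^3 + \Omega_\Lambda}$ with Simpson's rule and evaluates the cubic approximation alongside it. The cubic coefficients are entered as we read them from the text, and the function names and step count are assumptions made for this example.

```python
import math

C_KM_S = 299792.458           # speed of light in km/s
H0_UNIT = 100.0               # H0 = 100 h km/s/Mpc, so distances come out in h^-1 Mpc
OMEGA_M, OMEGA_L = 0.4, 0.6   # flat model adopted in the text


def comoving_distance(z, steps=200):
    """Comoving distance in h^-1 Mpc from Simpson's rule (steps must be even)."""
    def inv_e(zz):
        return 1.0 / math.sqrt(OMEGA_M * (1.0 + zz) ** 3 + OMEGA_L)

    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * inv_e(i * h)
    return (C_KM_S / H0_UNIT) * total * h / 3.0


def cubic_fit(z):
    """Cubic approximation with the coefficients as read from the text."""
    return 3000.0 * z - 940.0 * z ** 2 + 130.0 * z ** 3


for z in (0.05, 0.1, 0.2):
    print(z, round(comoving_distance(z), 2), round(cubic_fit(z), 2))
```

The size of the residual between the two functions depends on the exact cubic coefficients, so the comparison is best treated as a check on the transcription rather than as an independent result.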

In addition to large voids $30$-$50\,h^{-1}$ Mpc across, there are some void complexes approximately
$100\,h^{-1}$ Mpc across (like the one at 1 o'clock at $250\,h^{-1}$ Mpc distance), where we have several low
density voids next to each other separated by only relatively weak walls. Weinberg & Gunn
(1990b) show time sequences of the formation of such void complexes in models with $k^{-1}$ power
spectra. Galaxies drain off the walls between voids, and the walls eventually become so tenuous 
that they are barely noticeable. As pointed out by Park et al. (1992), such a case of a tenuous 
wall inside a large void is seen in the famous de Lapparent et al. (1986) slice. As a CDM model 
evolves in time, voids grow by galaxies flowing off of minor caustics onto major caustics. This is 
also the mechanism for the enhancement of great walls. 

Overall this slice, which goes to a depth of $500\,h^{-1}$ Mpc, looks remarkably like the sandwich
of three 1.5° wide slices (with two 1.5° wide gaps) from the LCRS (Shectman et al. 1996). Since 
our simulation was run before completion of the LCRS, we were delighted to see how the LCRS 
qualitatively confirmed many of the features of our simulation slice. It shows the same wall and 
void structure, the same rolling density at large scales, the same void and wall complexes, and 
the same overall approach to uniformity on very large scales. The LCRS shows a power spectrum 
consistent with CDM and $0.2 < \Omega h < 0.3$ (Lin et al. 1996) and a 2-D topology consistent with
random phase initial conditions (Colley 1997), as assumed in our simulation. 



4. Measuring the 3-D Topology of Large-Scale Structure 

Figs. 3(a) and 3(b) show contours of constant density in a subset of the three-dimensional
simulation out to $500\,h^{-1}$ Mpc. We have created a volume-limited sample for these figures by
throwing out those galaxies that would not be visible at $500\,h^{-1}$ Mpc. After smoothing with a
Gaussian filter of radius $R_s = 10\,h^{-1}$ Mpc, we have selected contour surfaces at the median density
contour [3(a)], and at the 93rd-percentile density contour [3(b)] (such that the 7% of the volume
with highest density is contained within this contour). The median density contour surface
is sponge-like, as expected from Gaussian random phase initial conditions (GMD), while Fig.
3(b) shows isolated clusters at the 93rd-percentile density, as expected for this contour (GWM).
These figures illustrate the hundreds of structures that will be available to constrain the 3-D genus
in the complete SDSS.

In order to quantify the topology of the three-dimensional large-scale structure in our mock
redshift catalog, we follow the approach outlined by GMD and GWM. We start with the galaxy
density field ($\delta$-functions at the locations of galaxies) and smooth it with a smoothing function
$W(r) = e^{-r^2/2R_s^2}$, with $R_s$ larger than the mean interparticle separation, to produce a smoothed
density field. Isodensity contour surfaces can then be constructed, and the genus $G_{3D}$ of such a
surface can be defined as

$$G_{3D} = \mbox{No. of holes} - \mbox{No. of isolated regions}, \qquad (4)$$

where "hole" means a hole like that of a donut and "isolated region" refers to a topologically separate,
isolated structure. For example, a density contour surface which consisted of 10 spherical
pieces each surrounding an isolated spherical cluster would have a genus of $G_{3D} = -10$. GMD
prove that

$$G_{3D} = -\frac{1}{4\pi} \int K\, dA, \qquad (5)$$

where $K$ is the Gaussian curvature ($= 1/r_1 r_2$, where $r_1$ and $r_2$ are the two principal radii of
curvature) and the integral is performed over the contour surface. Contour surfaces can be labeled
by the $\nu$ value, which is related to the volume fraction of space $f$ that they enclose:

$$f = \frac{1}{\sqrt{2\pi}} \int_\nu^\infty e^{-x^2/2}\, dx. \qquad (6)$$

In the case of a Gaussian one-point probability distribution, the value of $\nu$ in equation (6) is
equal to $\delta/\sigma$, the density contrast in units of the standard deviation. The definition of contours in
terms of fractional volume rather than density contrast greatly reduces the influence of non-linear
gravitational evolution and biased galaxy formation on the genus curve, a point emphasized by
GMD and GWM.
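
The volume-fraction labeling of contours amounts to the upper tail of a unit Gaussian, so converting between $\nu$ and $f$ needs only the complementary error function and its inverse. A minimal sketch using the Python standard library follows; the function names are ours.

```python
import math
from statistics import NormalDist


def volume_fraction(nu):
    """Fraction of the volume on the high-density side of the contour labeled nu."""
    return 0.5 * math.erfc(nu / math.sqrt(2.0))


def nu_from_fraction(f):
    """Inverse relation: the nu label of the contour enclosing high-density fraction f."""
    return -NormalDist().inv_cdf(f)
```

With these definitions the median contour ($f = 0.5$) has $\nu = 0$, the contour enclosing the densest 7% of the volume has $\nu \approx 1.48$, and $\nu = 1$ corresponds to $f \approx 0.16$.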

For a Gaussian random phase distribution, the genus per unit volume is 

$$g_{3D}(\nu) = A\,(1 - \nu^2) \exp(-\nu^2/2), \qquad (7)$$

where

$$A = \frac{1}{(2\pi)^2} \left[ \frac{\int P(k)\, k^2\, d^3k}{3 \int P(k)\, d^3k} \right]^{3/2},$$

and $P(k)$ is the power spectrum of the smoothed density field (Doroshkevich 1970; Adler 1981;
Bardeen et al. 1986; Hamilton, Gott, & Weinberg 1986). The median density contour, which
encloses half the volume ($\nu = 0$), is sponge-like [$g_{3D} > 0$, i.e., many holes; see Fig. 3(a)]. For
$f < 0.16$, $\nu > 1$, we have $g_{3D} < 0$, showing that we expect to see isolated clusters [see Fig. 3(b)].
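
The shape of the random-phase genus curve is easy to tabulate. The sketch below evaluates $A(1-\nu^2)\exp(-\nu^2/2)$ and, as an assumption based on the standard Gaussian-field result, computes the amplitude from a supplied mean-square wavenumber $\langle k^2 \rangle = \int P k^2 d^3k / \int P\, d^3k$ of the smoothed field; the function and argument names are ours.

```python
import math


def genus_amplitude(k2_mean):
    """Amplitude A = (<k^2>/3)^(3/2) / (2 pi)^2 for a Gaussian random-phase field."""
    return (k2_mean / 3.0) ** 1.5 / (2.0 * math.pi) ** 2


def genus_density(nu, amplitude=1.0):
    """Genus per unit volume of a Gaussian random-phase field at threshold nu."""
    return amplitude * (1.0 - nu * nu) * math.exp(-0.5 * nu * nu)


# Tabulate the curve in units of A on a coarse grid of thresholds.
curve = [(nu / 2.0, genus_density(nu / 2.0)) for nu in range(-6, 7)]
```

The curve is symmetric about $\nu = 0$, where it peaks at $A$ (sponge-like topology); it crosses zero at $\nu = \pm 1$ and reaches its most negative value, $-2A\,e^{-3/2}$, at $\nu = \pm\sqrt{3}$ (isolated clusters or voids).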

Numerous N-body experiments have shown that models which start off with Gaussian random
phase initial conditions, as we might expect from an inflationary model (where the fluctuations
are due to random quantum noise), retain density fluctuations with an approximately random
phase topology into the mildly non-linear regime (e.g., Melott, Weinberg, & Gott 1988). Since
the biased galaxy density is generally a monotonic function of the underlying mass density, even
the biased galaxy distribution will be approximately random phase when the contours are defined
by the volume fraction they enclose. However, important small deviations from random phase
topology can occur. Biasing, which systematically locates more proto-galaxies near peaks in the
initial density field, produces a small shift in the random phase topology curve (see equation 7)
to the left, a "meatball shift" indicating a slight preference for isolated clusters over isolated voids.
Non-linear gravitational evolution can also lead to a meatball shift if the smoothing length $R_s$ is
of order the correlation length. Non-linear gravitational evolution and biasing can interact with
each other to produce an enhanced meatball effect in the standard CDM model (Park & Gott
1991b). Because of these effects, the standard biased CDM $\Omega h = 0.5$, $b = 2$ model produces a
slight meatball shift in the genus curve. The hot dark matter model, in which small scale power is
damped by Landau damping, shows a slight bubble shift (to the right) in the non-linear regime
because voids inflate to larger volume (Melott et al. 1988). Whether non-linear gravitational
effects will produce a slight bubble shift or a slight meatball shift in the topology depends on the
slope of the power spectrum at approximately the correlation length scale. Non-linear evolution
also causes the amplitude of the genus curve to drop below that of the initial density field (Melott
et al. 1988; Park & Gott 1991b; Springel et al. 1998).

The points in Figs. 4(a) and 4(b) show genus curves measured from a volume-limited
sample of our mock SDSS catalog, in real space and redshift space, for smoothing lengths of
$R_s = 5\,h^{-1}$ Mpc and $R_s = 10\,h^{-1}$ Mpc respectively. Fig. 4(a) derives from a sample with a depth of
$R_{\rm max} = 500\,h^{-1}$ Mpc, while Fig. 4(b) derives from a sample out to $R_{\rm max} = 300\,h^{-1}$ Mpc, so that in
each case the mean separation of galaxies is approximately equal to the smoothing length. We
use a procedure similar to that developed by Gott et al. (1989) for measuring the topology of
existing galaxy redshift surveys. We first create a galaxy density field on a grid representing a
cube $1000\,h^{-1}$ Mpc on a side, then convolve this density field with a Gaussian smoothing filter,
using the fast Fourier transform (FFT). We set the smoothing length $R_s$ equal to the mean
separation $d$. With this definition, $R_s$ is larger by $\sqrt{2}$ than the smoothing parameter $\lambda$ used by
Gott et al. (1989), and this smoothing criterion is therefore more conservative than that used by
Gott et al. (1989) and other observational analyses, which have typically set $\lambda \sim d$. Because of the
enormous number of galaxies in the Sloan sample, we can afford to use a smoothing criterion that
suppresses shot noise more completely.

We set the galaxy density to zero outside the survey volume, so we must normalize the 
smoothed density at a given location by the fraction of the smoothing window that lies within 
the survey. Technically, we achieve this by creating a "mask" array that is one within the survey 
volume and zero outside, smoothing this mask, and dividing the smoothed galaxy density field 
by the smoothed mask (see Melott & Dominik 1993). The effective smoothing volume near the 
survey boundary is thus smaller than the smoothing volume in the bulk of the survey by up to a 
factor of two. In order to avoid any systematic biases arising in these border cells, we measure the 
topology only in the volume within which the smoothed mask has a value of at least 0.8, implying 
that at most 20% of the smoothing window extends outside of the sample. 
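
The mask normalization described above can be illustrated with a minimal one-dimensional sketch. It is not the analysis code used for the paper: it replaces the 3-D FFT convolution with a direct sum on a 1-D grid of unit spacing, and the function names (`gaussian_smooth`, `masked_smooth`) and the toy survey geometry are assumptions made for illustration only.

```python
import math

def gaussian_smooth(values, r_g):
    """Direct (non-FFT) convolution with exp(-x^2 / (2 r_g^2)) on a unit-spaced 1-D grid."""
    n = len(values)
    half = int(math.ceil(4 * r_g))          # truncate the kernel at 4 smoothing lengths
    kernel = [math.exp(-0.5 * (k / r_g) ** 2) for k in range(-half, half + 1)]
    norm = sum(kernel)
    out = []
    for i in range(n):
        total = 0.0
        for k in range(-half, half + 1):
            j = i + k
            if 0 <= j < n:                  # cells beyond the grid contribute nothing
                total += kernel[k + half] * values[j]
        out.append(total / norm)
    return out

def masked_smooth(density, mask, r_g, min_mask=0.8):
    """Smooth density and mask, divide, and flag cells whose smoothed mask is >= min_mask."""
    sm_density = gaussian_smooth(density, r_g)
    sm_mask = gaussian_smooth(mask, r_g)
    field = [d / m if m > 0 else 0.0 for d, m in zip(sm_density, sm_mask)]
    usable = [m >= min_mask for m in sm_mask]
    return field, sm_mask, usable

# Toy survey: uniform density 2.0 inside cells 20..79 of a 100-cell grid, zero outside.
mask = [1.0 if 20 <= i < 80 else 0.0 for i in range(100)]
density = [2.0 * m for m in mask]
field, sm_mask, usable = masked_smooth(density, mask, r_g=3.0)
```

In this toy case the normalized field recovers the input density right up to the survey edge, where the smoothed mask falls to roughly one half, while the 0.8 cut excludes those border cells from the topology measurement.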

The dark points in Figs. 4(a) and 4(b) show the results of applying this procedure to the
simulation's real-space galaxy density field. We compute the genus using a slightly modified
version of the program CONTOUR (Weinberg 1988), which is based on the curvature summation
algorithm proposed by GMD (for an alternative algorithm see Coles, Davies, & Pearson 1996). We
use 27 values of the threshold parameter $\nu$ (defined by the volume fraction $f$, as described above),
ranging from $\nu = -3.25$ to $\nu = 3.25$ in steps of 0.25. We reduce small-scale jitter in the genus
curve by averaging over small ranges in $\nu$: a point at $\nu$ actually represents the average genus of
contours at five neighboring thresholds, $\langle g \rangle(\nu) = 0.2 \sum g(\nu + \{-0.05, -0.025, 0, 0.025, 0.05\})$.
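
As a hedged illustration of the threshold convention, the sketch below assumes the standard volume-fraction definition, $f = (2\pi)^{-1/2} \int_\nu^\infty e^{-x^2/2} dx$, which this section refers to but does not restate. The helper names and the callable `genus_at` (a stand-in for whatever routine returns the genus at a single threshold) are hypothetical.

```python
import math

def volume_fraction(nu):
    """Fraction of the volume above threshold nu for a Gaussian one-point distribution."""
    return 0.5 * math.erfc(nu / math.sqrt(2.0))

def density_threshold(cells, nu):
    """Density value such that a fraction volume_fraction(nu) of the cells lies at or above it."""
    ordered = sorted(cells, reverse=True)
    n_above = int(round(volume_fraction(nu) * len(ordered)))
    n_above = min(max(n_above, 1), len(ordered))
    return ordered[n_above - 1]

# The 27 thresholds from -3.25 to 3.25 in steps of 0.25.
nus = [-3.25 + 0.25 * i for i in range(27)]

def averaged_genus(genus_at, nu):
    """Average the genus over five closely spaced thresholds centered on nu."""
    offsets = (-0.05, -0.025, 0.0, 0.025, 0.05)
    return 0.2 * sum(genus_at(nu + d) for d in offsets)
```

Because the threshold is set by rank rather than by density value, any monotonic transformation of the density field leaves the resulting contours unchanged.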

Ideally, we would estimate errors in the genus curve, and the covariance matrix of the
errors, using a large number of mock catalogs like the one analyzed here, each drawn from an
independent N-body simulation. By the time the Sloan survey is complete, a computationally
ambitious approach like this may be feasible. For the present we rely on a less demanding
procedure, using variation within subsets of the sample to estimate the uncertainty in the mean
result. We divide the survey volume into four quadrants along the symmetry axes of the survey
ellipse. We measure the topology separately in each of these four quadrants and compute
a 1$\sigma$ error bar at each value of $\nu$ from the 1$\sigma$ dispersion of the genus among the four
quadrants divided by $\sqrt{3}$. (The distribution of errors is in this case expected to follow a Student's
t-distribution for $N = 4$; cf. Colley 1997.)
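
The quadrant error estimate amounts to a standard error of the mean. The sketch below assumes that "dispersion" means the r.m.s. scatter about the quadrant mean (normalized by $N$), so that dividing by $\sqrt{N - 1} = \sqrt{3}$ gives the error in the mean; the function name and the sample values are illustrative only.

```python
import math

def quadrant_error(genus_values):
    """Return (mean, error in the mean) from independent sub-volume genus estimates."""
    n = len(genus_values)
    mean = sum(genus_values) / n
    # r.m.s. dispersion about the mean, normalized by n (an assumed convention)
    dispersion = math.sqrt(sum((g - mean) ** 2 for g in genus_values) / n)
    return mean, dispersion / math.sqrt(n - 1)

# Hypothetical genus values at one threshold in the four quadrants.
mean_genus, error = quadrant_error([10.0, 12.0, 14.0, 16.0])
```

With this convention the result is identical to the more familiar form, the unbiased sample standard deviation divided by $\sqrt{N}$.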

The results obtained from the mock catalog agree with the results from the full cube at about
the level expected from the 1$\sigma$ error bars, indicating that our techniques for dealing with the
finite survey volume do not introduce systematic biases and that the quadrant procedure yields
reasonable error estimates.

The open points in Figs. 4(a) and 4(b) show the genus curve obtained from the redshift-space
galaxy distribution, with error bars estimated from the dispersion of values in the four quadrants.
Peculiar velocities have only a small systematic effect on the genus curve, raising it slightly at
$\nu \gtrsim 1$ and lowering the peak slightly at $\nu \sim 0$. Matsubara (1996) noted that in the linear regime,
peculiar velocity distortions (Kaiser 1987) cause the genus curve in redshift space to decrease in
amplitude relative to the real-space genus curve while retaining the same theoretical form.

We observe in Figs. 4(a) and 4(b) that the genus curve shows no discernible left-right shift
(the peak is still at $\nu = 0$), but there are slightly more clusters observed at $\nu = 1$ than voids at
$\nu = -1$. This effect of non-linear gravitational evolution was first pointed out by Park and Gott
(1991b), who noted its presence in the genus curves of evolved CDM mass distributions. They
also noted that the effect was more pronounced in the biased particle distributions because of the
additional impact of biasing.

Matsubara (1994) has calculated the expected behavior of the genus of Gaussian random
phase initial conditions after weakly non-linear gravitational evolution. He provides a perturbative
analysis of the genus, in which odd terms in $\nu$ add to the usual, even $(1 - \nu^2)$ term in the
three-dimensional genus curve for Gaussian random phase fields. Note that Matsubara (1994)
uses the definition $\nu_\sigma = \delta/\sigma$ instead of the implicit definition in terms of volume fraction that we
adopt (eq. [6]). The two definitions are identical only when the one-point probability distribution
is exactly Gaussian. Matsubara's relevant result for the three-dimensional genus is the equation



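$$G^{(3)}(\nu_\sigma) = -A\, e^{-\nu_\sigma^2/2} \left[ H_2(\nu_\sigma) + \sigma \left( \frac{S}{6} H_5(\nu_\sigma) + \frac{3T}{2} H_3(\nu_\sigma) + 3U H_1(\nu_\sigma) \right) \right], \qquad (10)$$

where we have introduced a number of new terms. The $H_n$ are Hermite polynomials of order $n$
($H_1 = \nu_\sigma$, $H_2 = \nu_\sigma^2 - 1$, $H_3 = \nu_\sigma^3 - 3\nu_\sigma$, $H_5 = \nu_\sigma^5 - 10\nu_\sigma^3 + 15\nu_\sigma$), and $\sigma$ is the r.m.s. fluctuation
in the smoothed galaxy distribution. $S$, $T$ and $U$ are tabulated by Matsubara (1994) for various
power spectra, the most relevant of which is $n = -1$.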

In Fig. 5, we have plotted the genus in terms of both our usual $\nu$ definition for density
thresholds, and in terms of strict standard deviations ($\nu_\sigma$), as Matsubara (1994) recommends.
Also, we have restricted the range of the plot to reflect the limits suggested by Matsubara &
Suto (1996), $-0.2 < \nu\sigma < 0.4$. The heavy solid curve shows the random phase theoretical
form, $g_{3D} = -A H_2(\nu) e^{-\nu^2/2} = A(1 - \nu^2) e^{-\nu^2/2}$, that best fits the data points obtained from
the SSDSS (in real space), with the amplitude A treated as a free parameter. Springel et al.
(1998) suggest using the amplitude drop, the ratio of the fitted value of A to the value computed 
from the measured power spectrum via equation (P), as a diagnostic for the degree of non-linear 
gravitational evolution, and Canavezes et al. (1998) exploit this method to good effect in their 
analysis of the PSCZ redshift survey. We have not incorporated this technique into our present 
analysis, but it will certainly be valuable for the analysis of the real SDSS. 
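
A one-parameter amplitude fit of the random phase form $A(1 - \nu^2) e^{-\nu^2/2}$ can be sketched as a weighted linear least-squares problem. The code below is an illustration under stated assumptions (independent errors, i.e. a diagonal covariance matrix, and hypothetical function names); it is not the fitting code used for Fig. 5.

```python
import math

def gaussian_genus_shape(nu):
    """Shape of the 3-D random phase genus curve, (1 - nu^2) exp(-nu^2 / 2)."""
    return (1.0 - nu * nu) * math.exp(-0.5 * nu * nu)

def fit_amplitude(nus, genus, errors):
    """Weighted least-squares amplitude A minimizing chi^2 for the model A * shape(nu)."""
    num = sum(g * gaussian_genus_shape(n) / e ** 2 for n, g, e in zip(nus, genus, errors))
    den = sum(gaussian_genus_shape(n) ** 2 / e ** 2 for n, e in zip(nus, errors))
    return num / den

def chi_square(nus, genus, errors, amplitude):
    """chi^2 of the fitted curve, assuming a diagonal covariance matrix."""
    return sum(((g - amplitude * gaussian_genus_shape(n)) / e) ** 2
               for n, g, e in zip(nus, genus, errors))

# Synthetic, noise-free points with amplitude 5 (illustrative values, not survey data).
nus = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
genus = [5.0 * gaussian_genus_shape(n) for n in nus]
errors = [1.0] * len(nus)
best_a = fit_amplitude(nus, genus, errors)
```

Because the model is linear in $A$, the minimum of $\chi^2$ has a closed form and no iterative optimizer is needed; adding further terms that are linear in their coefficients keeps the problem linear as well.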

For our fits of the pure Gaussian form of the genus curve, the $\chi^2$ values (calculated assuming
a diagonal covariance matrix with 6 degrees of freedom) are 33 for $\nu$ and 102 for $\nu_\sigma$. If we allow
ourselves to fit for the odd terms ($S$, $T$ and $U$) in $\nu$, we obtain the dashed curves, for which the $\chi^2$
values are reduced to 10 and 7, respectively. These rather dramatic reductions indicate significant
improvement in the fits, well beyond that expected from fitting three new independent parameters
alone (in which case $\chi^2$ should decrease by 3, for three fewer degrees of freedom).

If we do not fit for $S$, $T$, and $U$ but instead use the values implied by Matsubara's (1994)
analysis given the value of $\sigma = 0.408$ measured from the SSDSS density field smoothed at
$10\,h^{-1}$Mpc, then we obtain the dotted line in Fig. 5. This line improves the fit over random-phase
for the $\nu_\sigma$ genus points (open points), but is a worse fit for our usual $\nu$ points (filled points) than
the original random-phase curve (with $\chi^2$ values of 56 for $\nu_\sigma$ and 144 for $\nu$). In Table 1, we have
tabulated our fit values for $S$, $T$ and $U$ and listed the values expected by Matsubara (1994).
While it seems that the genus curve is distorted by non-linear evolution in the direction predicted
by Matsubara (1994), the amplitude of the distortions in the simulation is significantly smaller.
Matsubara & Suto (1996) have also compared cosmological simulations to these predictions and
found somewhat better agreement in some cases, particularly when $\sigma$ is lower than our 0.408 value.

It is perhaps not surprising that Matsubara's perturbative treatment breaks down when the
r.m.s. fluctuation amplitude is as large as $\sigma = 0.408$. For example, in Matsubara (1994), at large
$\sigma$ the genus becomes sponge-like again for $\nu < -2$. Since his theoretical genus curve shows
isolated voids for $\nu = -2$, the only way to obtain a positive genus for $\nu < -2$ is if these isolated
voids begin to look like isolated sponges at lower density thresholds, which seems implausible. By
examining Fig. 1 of Matsubara (1994), we have found an approximate relation, $\nu \gtrsim -1.7 - 0.18/\sigma$,
which describes the plausible range of applicability of Matsubara's treatment (for $n = 1$).

The SSDSS redshift catalog yields spectacularly precise measurements of the genus curve
at these smoothing lengths ($5\,h^{-1}$Mpc and $10\,h^{-1}$Mpc). These genus curves clearly indicate (as
they should) that the initial conditions were random phase (which they were). The departures
from the pure Gaussian form of the genus curve, while subtle, are detected at high statistical
significance, and they have the form predicted for non-linear evolution by Matsubara (1994) but
lower amplitude.



5. The Topology of the Peculiar Velocity Field 

The biggest uncertainty in mapping the mass density field comes from the fact that the 
observed luminosity density field does not necessarily follow the true mass density field. This 
problem, known as "biasing," prevents us from being able to map the mass density field directly 
with great confidence. 

We can circumvent biasing uncertainties by considering galaxies to be tracers of the peculiar 
velocity field rather than the density field. If the gravitational instability picture is valid, peculiar 
velocities uniquely reflect the mass density field. Bertschinger & Dekel (1989) argue that when 
adopting this approach, any type of galaxy, cluster, or massive object can serve as a test body 
whose motion is driven by the true density field. Moreover, the peculiar velocity field on a given 
scale responds to fluctuations of a larger scale in the density field. Since mass density fluctuations
on large scales obey linear theory better than fluctuations on small scales, this difference in range
is to our advantage.

We can use the peculiar velocity field in our topological study of large-scale structure. If the 
density field is Gaussian (random phase), then the potential field is also Gaussian, and in the 
linear regime, the peculiar velocity field, being the gradient of the Gaussian potential field, should 
also be Gaussian (see e.g., Bardeen et al. 1986; Park et al. 1992). The radial peculiar velocity 
field (which is the only measurable component of the 3D velocity field), with the value of the 
peculiar radial velocity treated as a scalar, is also Gaussian in this case (A. J. S. Hamilton, private 
communication). Thus, we can extract information about large-scale structure by examining the 
topology of the radial peculiar velocity field. For example, if, by measuring the genus curve of the
radial peculiar velocity field (i.e., measuring the genus of iso-velocity surfaces), we find that it is
not Gaussian, we must conclude that the density field is also not Gaussian.

The SDSS will provide high resolution spectra for approximately $10^6$ galaxies. In addition to
redshifts, spectra will provide other information, including line-widths and velocity dispersions.
These measurements can be used to obtain redshift-independent distance estimates to galaxies
using methods such as the Tully-Fisher (TF; Tully & Fisher 1977) relation for spiral galaxies, the
Faber-Jackson (FJ; Faber & Jackson 1976) and $D_n$-$\sigma$ (Dressler et al. 1987; Lynden-Bell et al.
1988) relations for elliptical galaxies, and the Brightest Cluster Galaxy (BCG; Hoessel 1980; Lauer
& Postman 1994) relation. These methods typically yield r.m.s. distance errors of 15% for an
individual galaxy. The TF relation can typically be applied to SDSS spiral galaxies only if their
redshifts are $z \gtrsim 0.05$ because at smaller distances the 3"-diameter fiber aperture of the SDSS
spectrographs does not subtend enough of the galaxy to yield a correct H$\alpha$ line width. Even with
this restriction, we anticipate that the SDSS will obtain redshift-independent distances for nearly
250,000 galaxies (Knapp et al. 1997).

We treat the simulation as a realistic data set by imbuing individual galaxies with 15% 
distance errors, creating an "observed" catalog that contains true redshifts of all the galaxies 
(since redshifts can be measured to high precision) and estimates of the galaxies' distances. We 
treat this "observed" catalog in the same way we would treat the real SDSS dataset. We give the 
galaxies estimated radial peculiar velocities 

$$V_r = cz - H_0 d, \qquad (11)$$

where $d$ represents the galaxies' estimated distances (with 15% 1$\sigma$ errors) and $z$ represents the
galaxies' true redshifts. In order to create the radial peculiar velocity field, we place the galaxies
(with their estimated values of $V_r$) at their redshift distances, since at $z \gtrsim 0.05$ the error due to the
15% scatter in estimated distances greatly exceeds the few hundred km s$^{-1}$ error due to typical
galaxy peculiar velocities. It is not to our advantage to go out to the full depth of the survey
because the further out we go the larger the errors in the velocity field become. We therefore limit
the velocity survey at an outer depth of $r_{max} = 300\,h^{-1}$Mpc.

Once we have radial peculiar velocity estimates for all the galaxies in our survey volume, we
smooth the data in order to obtain the smoothed large-scale radial velocity field. The velocity
estimates for individual galaxies have large uncertainties that are generated by the uncertainties
in the distance measurements. The peculiar velocities of individual galaxies are usually much
smaller than these uncertainties, making them impossible to determine individually. By smoothing,
however, we effectively bin the velocity data in smoothing volumes, each containing a large number
of individual data points. In this way we can beat down the noise in the $V_r$ field (velocity errors are
divided by $\sqrt{N}$, where $N$ is the effective number of galaxies contained in one smoothing volume).
We smooth the simulated data with a Gaussian filter of smoothing radius $R_G = 21\,h^{-1}$Mpc. With
this choice of smoothing length, the expected number of galaxies per smoothing volume at the
outer edge of the velocity survey is ~1100, and the expected r.m.s. error in the smoothed radial
velocity field is ~130 km/s. Of course, the error decreases at smaller distances.
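
The noise estimate quoted above follows from simple arithmetic, sketched here under the assumption that distances are expressed in $h^{-1}$Mpc so that $H_0 = 100$ km/s per $h^{-1}$Mpc. The function names are illustrative and not part of any survey software.

```python
import math

C_KM_S = 299792.458   # speed of light in km/s
H0 = 100.0            # km/s per (h^-1 Mpc); assumes distances are in h^-1 Mpc

def radial_peculiar_velocity(z, d_est):
    """Estimated radial peculiar velocity, V_r = c z - H0 d, in km/s."""
    return C_KM_S * z - H0 * d_est

def smoothed_velocity_error(distance, n_galaxies, frac_error=0.15):
    """r.m.s. velocity error after averaging n_galaxies independent distance estimates."""
    per_galaxy = H0 * distance * frac_error   # 15% distance error expressed in km/s
    return per_galaxy / math.sqrt(n_galaxies)

# Outer edge of the velocity survey: 300 h^-1 Mpc with about 1100 galaxies per volume.
edge_error = smoothed_velocity_error(300.0, 1100)
```

A single galaxy at $300\,h^{-1}$Mpc carries a velocity error of 4500 km/s; averaging about 1100 galaxies reduces this to roughly 136 km/s, in line with the ~130 km/s quoted above.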

There is one further correction that needs to be made to the "observed" smoothed radial
peculiar velocity map. We subtract from it a smoothed correcting map that contains the velocity
field $V_r' = H_0 \Delta d$ (i.e., the field obtained for true $V_r = 0$), where the $\Delta d$ values are randomly
generated 15% (1$\sigma$) distance errors for the galaxies, drawn independently of those previously
used. This simple correction gets rid of systematic (Malmquist-type) effects quite well.

As we did for the galaxy density field, we calculate the genus of iso-velocity contours in 
the smoothed radial velocity field using CONTOUR and estimate errors by breaking the survey 
volume into four identical (in shape and volume) sub-volumes and computing the scatter in the 
four independent estimates of the genus. 

Fig. 6 shows the True and "Observed" smoothed radial peculiar velocity fields (note that this
is a different slice than shown in Fig. 2). From Fig. 6, it is fairly obvious that the radial peculiar
velocity field can be mapped out to a large scale, despite the large uncertainties in our distance
measurements. The SDSS data should be sufficient for this purpose. Fig. 6 shows that the
"Observed" map captures the most prominent features of the True map. There is some distortion,
as expected, but overall the maps compare reasonably. The r.m.s. error in our "Observed" map
(i.e., the r.m.s. pixel by pixel difference between it and the True map), 121.0 km/s, is smaller than
the actual velocity amplitudes seen in our True map (whose r.m.s. value is 148.8 km/s).

The genus curves for both the True and "Observed" radial peculiar velocity fields (Fig. 7) are
very noisy, as expected, because of the small number of resolution elements in our survey volume.
Nevertheless, both genus curves are roughly Gaussian (the theoretical curve fits our data points
as well as we would expect given the 1$\sigma$ error bars).



6. The 2-D Topology of Redshift Slices 

Before discussing the 2-D topology of redshift slices, we should explain the somewhat 
untraditional coordinate system of the SDSS. A galaxy's position on the sky is defined by a 
survey latitude $\eta$ and a survey longitude $\lambda$, but the nature of the constant latitude and constant
longitude curves is backwards from the usual; the constant latitude curves are great circles that
connect the survey poles (an east pole at $\delta = 0$, $\alpha = 18^h20^m$ and a west pole at $\delta = 0$, $\alpha = 6^h20^m$),
and the constant longitude curves are small circles centered on these poles. The SDSS imaging
observations are carried out in scanning mode (see Gunn et al. 1998), and the constant latitude
curves are the scan tracks.

Since imaging must precede spectroscopy in any given area of the survey, an early product 
from the SDSS is likely to be redshift survey slices, much like those of the LCRS. We have therefore 
selected six slices of constant survey latitude from the mock redshift catalog for 2-D topological 
analysis. These slices are centered on arcs of constant $\eta$, but they have constant angular width
in the sky, not constant $\Delta\eta$; thus, while the slices' centers are great circles, their upper and
lower boundaries are not. In order to allow more direct comparison to existing results from the
LCRS, we have made these slices 1.5° × ~80°, rather than the 2.5° × 130° of a full SDSS imaging
stripe. We choose three slices at low latitude ($\eta$ = -33°, -30°, -27°) and three at high latitude
($\eta$ = 27°, 30°, 33°), again to obtain a sample similar to that of the full LCRS.

The galaxy sample in the SSDSS is, roughly speaking, magnitude-limited at $m_{max} = 17.9$
in $r'$ (see Section 3). As with any magnitude-limited survey, conversion of counts to real galaxy
density requires a good understanding of selection effects. We can compensate for these effects by 
constructing the two-dimensional selection function, which at a given radius reflects the expected 
surface density of galaxies in the survey. 

First we find the maximum distance, $D_{max,i}$, at which the $i$th galaxy (with actual distance $D_i$
and magnitude $m_i$) could be seen. To a good approximation, this maximum distance is

$$D_{max,i} = D_i \cdot 10^{0.2(m_{max} - m_i)}, \qquad (12)$$

though several small complications exist; we include effects such as limiting surface brightness and
$K$-corrections in constructing $D_{max}$. With each $D_{max,i}$ computed, we can invoke Schmidt's (1968)
$V/V_{max}$ method to construct the expected volume-density of galaxies as a function of radius,

$$\rho_s(r) = \frac{3}{\Omega_s} \sum_{D_{max,i} > r} D_{max,i}^{-3} \qquad (13)$$

(Gott et al. 1989), where $\Omega_s$ is the solid angle of the slice. We multiply $\rho_s$ by radius $r$ to account
for the linearly expanding wedge-shaped profile of the slice, and also by a "shape factor" $S$ (Park
et al. 1992) which relates solid angle in the sky to azimuthal angle in the slice map,

$$\sigma_s(r) = S\, r\, \rho_s(r). \qquad (14)$$

Here $S = 2\sin(w/2)$, where $w$ is the constant angular width of the slice. Recall that the center of
the slice follows a great circle of constant $\eta$, which causes a difference from the shape factor given
by Park et al. (1992) for declination slices, which do not follow great circles.
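
The selection function construction can be sketched as follows. The forms used here are assumptions based on the standard $V/V_{max}$ method: a maximum distance scaling as $10^{0.2(m_{max} - m)}$, and a volume density obtained by summing $1/V_{max}$ with $V_{max} = \Omega_s D_{max}^3/3$ over galaxies still visible at radius $r$. Surface brightness limits and $K$-corrections are ignored, and all function names are hypothetical.

```python
import math

def d_max(distance, mag, mag_limit=17.9):
    """Maximum distance at which a galaxy of apparent magnitude mag would pass the limit."""
    return distance * 10 ** (0.2 * (mag_limit - mag))

def volume_density(r, d_max_values, solid_angle):
    """Expected volume density at radius r: sum of 1/V_max over galaxies with D_max > r."""
    return (3.0 / solid_angle) * sum(d ** -3 for d in d_max_values if d > r)

def surface_density(r, d_max_values, solid_angle, width_rad):
    """Expected surface density in the slice map, S * r * rho_s(r), with S = 2 sin(w/2)."""
    shape = 2.0 * math.sin(width_rad / 2.0)
    return shape * r * volume_density(r, d_max_values, solid_angle)

# Illustrative values only: two galaxies, a 3-steradian "slice", a 1.5 degree width.
limits = [d_max(100.0, 17.9), d_max(20.0, 16.395)]
rho = volume_density(50.0, limits, solid_angle=3.0)
sigma = surface_density(50.0, limits, solid_angle=3.0, width_rad=math.radians(1.5))
```

Each galaxy contributes to the expected density only at radii where it would still have entered the sample, which is what lets the selection function fall smoothly with distance rather than following the clustering of the galaxies themselves.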

We now cut a slice from this azimuthally symmetric selection function with the same longitude
spread as the actual slice. We then smooth both the selection function slice and the actual survey
slice with a Gaussian disc, $e^{-r^2/2R_G^2}$. When we divide the smoothed survey slice by the smoothed
selection slice, we have a map where the effects of the flux limit (and edge effects) have been
minimized. We truncate the slice at the radius where the smoothing length, $R_G$, equals $\sigma_s^{-1/2}$ to
avoid shot noise sampling effects (we actually truncate one smoothing length inside of that radius
to reduce these effects further).

We have chosen our smoothing length to ensure that most of the structures detected are 
well within the linear regime, because in the linear regime fluctuations at fixed position simply 
grow in amplitude in proportion to the growth factor, so that the topology of the present-day 
fluctuations is similar to the topology in the initial fluctuations. Non-linear effects typically 
become important on scales below about $8\,h^{-1}$Mpc, so we have chosen a smoothing convolution
kernel of $\exp(-r^2/2R_G^2)$, with $R_G = 20\,h^{-1}$Mpc. (As mentioned previously, Matsubara [1996] has
shown that peculiar velocities in the linear regime do not distort the form of the genus curve
expected for Gaussian fluctuations.)

Fig. 8 provides a contour map of one of the smoothed, calibrated slices ($\eta$ = -30°, width
1.5°), with the galaxy locations over-plotted. The heavy lines are contours of high density; the
lighter lines are contours of low density, and the dashed line is the median density contour. We
will discuss the exact values of these contours below; for now we just wish to illustrate that the
contours described correctly identify real over-densities and voids in the data. Also, many "fingers
of God" are visible in the data; these are indicators of non-linear effects on small scales (such as
virialized clusters). The structures visible in the contours, however, are much larger than the
fingers of God, which suggests that we have smoothed out most of the non-linear features in the
data (the r.m.s. fluctuation amplitude in the slices is $\sigma_{20\,h^{-1}{\rm Mpc}} = 0.3$).

As with the LCRS, we can immediately note the large number of structures, critical to a 
quantitative analysis via the genus statistic. When compared with Park et al. (1992), which
uses the Geller & Huchra (1989) survey, we find a vast improvement in the number of structures
detected (by about a factor of 10), thus a much stronger lever-arm with which to measure topology
statistics. 

Following Melott et al. (1989) and Gott et al. (1990), we define the two-dimensional genus 
G2D of the excursion set for a random density field on a plane as 

$$G_{2D} = (\text{number of isolated high-density regions}) - (\text{number of isolated low-density regions}). \qquad (15)$$

Equivalently, the genus can be defined as the total curvature of the contours. Assuming a contour
defines a differentiable curve $C$ on the map, its total curvature is given by the integral

$$K = \int_C \kappa\, ds = 2\pi G_{2D}, \qquad (16)$$

where $\kappa$ is the local curvature, $s$ parameterizes the curve, and $G_{2D}$ is the genus of the contour. An
isolated overdense region will contribute +1 to the total map genus and a void ("hole") in it will
decrease the genus by 1. In practice, contours may cross the edge of the survey region; in that
case the partial curves contribute non-integer rotation indices to the genus.
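
The region-counting definition of the two-dimensional genus can be illustrated on a pixel grid. This is a simplified sketch, not the contour-curvature algorithm used in the analysis: it counts only regions that do not touch the map border (so it ignores the fractional contributions of contours crossing the edge), and it adopts 8-connectivity for high-density pixels and 4-connectivity for low-density pixels, a convention chosen here for consistency rather than taken from the text.

```python
def count_isolated(grid, want_high, threshold, diagonal):
    """Count connected regions on one side of the threshold that do not touch the border."""
    rows, cols = len(grid), len(grid[0])
    steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    if diagonal:
        steps += [(1, 1), (1, -1), (-1, 1), (-1, -1)]
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0] or (grid[r0][c0] >= threshold) != want_high:
                continue
            stack = [(r0, c0)]          # flood fill one connected region
            seen[r0][c0] = True
            touches_border = False
            while stack:
                r, c = stack.pop()
                if r in (0, rows - 1) or c in (0, cols - 1):
                    touches_border = True
                for dr, dc in steps:
                    rr, cc = r + dr, c + dc
                    if (0 <= rr < rows and 0 <= cc < cols and not seen[rr][cc]
                            and (grid[rr][cc] >= threshold) == want_high):
                        seen[rr][cc] = True
                        stack.append((rr, cc))
            if not touches_border:
                count += 1
    return count

def genus_2d(grid, threshold):
    """Isolated high-density regions minus isolated low-density regions."""
    return (count_isolated(grid, True, threshold, diagonal=True)
            - count_isolated(grid, False, threshold, diagonal=False))

# Toy map: a ring-shaped cluster enclosing a hole, plus one separate single-pixel cluster.
toy = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 1, 0, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
toy_genus = genus_2d(toy, 0.5)
```

In the toy map the two isolated clusters contribute +2 and the enclosed hole contributes -1, while the low-density background touches the border and is not counted.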

A two-dimensional random phase Gaussian density field will generate a genus per unit area

$$g = A \nu e^{-\nu^2/2}, \qquad (17)$$

where $A$ is a constant and $\nu$ is the threshold value, related to the area fraction $f$ by equation (6).
The value of $A$ depends on an integral of the detailed power spectrum of the fluctuations, but in
the case of a perfect power-law spectrum with index $n > -1$,

$$A_{n>-1} = \frac{1}{2 (2\pi)^{3/2} R_G^2}, \qquad (18)$$

where $R_G$ is the Gaussian smoothing length (Melott et al. 1989; see also Adler 1981; Coles 1988;
Park et al. 1990; Gott et al. 1992). We have explicitly used the area fraction in order to be less
sensitive to non-linear effects and biasing as discussed in Section 4. Simulations have shown that
the genus curve defined in this way more reliably reflects the genus curve of the initial fluctuations,
whose nature we wish to test.
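
For orientation, the random phase curve and its amplitude can be evaluated numerically. The sketch assumes that the amplitude for $n > -1$ takes the form $1/[2(2\pi)^{3/2} R_G^2]$, which is our reading of the damaged equation (18) and should be checked against Melott et al. (1989); the function names are hypothetical.

```python
import math

def amplitude_n_gt_minus1(r_g):
    """Assumed genus amplitude per unit area for n > -1: 1 / (2 (2 pi)^(3/2) R_G^2)."""
    return 1.0 / (2.0 * (2.0 * math.pi) ** 1.5 * r_g ** 2)

def genus_per_area(nu, amplitude):
    """Two-dimensional random phase genus per unit area, A nu exp(-nu^2 / 2)."""
    return amplitude * nu * math.exp(-0.5 * nu * nu)

# Smoothing length of 20 h^-1 Mpc, as used for the slices.
a_theory = amplitude_n_gt_minus1(20.0)        # per (h^-1 Mpc)^2
peak = genus_per_area(1.0, a_theory)          # the curve peaks at nu = +1
trough = genus_per_area(-1.0, a_theory)       # and has its minimum at nu = -1
```

The curve is antisymmetric in $\nu$, so a pure random phase field shows equal numbers of isolated clusters at $\nu = 1$ and isolated voids at $\nu = -1$; departures from that symmetry are the signature tested in what follows.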

In Fig. 8, we have plotted the contour map of the smoothed density distributions in one of
the SSDSS slices, with contours at $\nu = \{-2, -1, 0, 1, 2\}$. The heavy lines represent $\nu = \{1, 2\}$, the
light lines $\nu = \{-2, -1\}$, and the dashed line $\nu = 0$. The map shows many excursions at non-zero
values of $\nu$, while the median contour ($\nu = 0$) wanders through the map rather randomly, as
expected.

In Fig. 9, we have plotted as a function of $\nu$ the mean genus per unit area (averaged from
the estimates in each of the six slices). The best-fit theoretical genus curve expected for a
random phase Gaussian distribution, equation (17), is shown as a solid curve, with a best-fit
value of $A = 0.7 A_{n>-1}$. The error bars are the 68% (solid) and 95% (dotted) confidence limits,
estimated from the formal Student's t-distribution for $n = 6$ (six slices) (Lupton 1993). This
somewhat less familiar distribution is necessary whenever the error is estimated from the data
distribution directly, as opposed to an independent estimation of the error (see Colley 1997). The
t-distribution is equivalent to a Cauchy (Lorentzian) distribution for $n = 2$, but it converges to a
Gaussian distribution quite rapidly (by $n = 20$, the difference from Gaussian is negligible in most
applications). The t-distribution has broad wings to allow for accidentally low sample variances
($s^2$), which cause the t-variate, $(\bar{x} - \mu)/(s/\sqrt{n - 1})$, to reach anomalously high values relative to
the normal variate, $(\bar{x} - \mu)/(\sigma/\sqrt{n})$ ($\bar{x}$ is the sample mean, $\mu$ is the true mean, and $\sigma$ is the true
standard deviation).

The Gaussian-field theoretical curve (solid line) in Fig. 9 fits the SSDSS data points reasonably
well. To be more quantitative about this, we perform a test related to the $\chi^2$ statistic. Our
"not-quite $\chi^2$ statistic," $\tilde{\chi}^2$, is computed in the usual way, but is different from a formal $\chi^2$ in that
we have used the formal 1$\sigma$ errors as estimated from the six slices (recall the genus measurements
are t-distributed variates, not Gaussian),

$$\tilde{\chi}^2 = \sum_{i=1}^{21} \frac{(\bar{g}_i - \tilde{g}_i)^2}{\sigma_{\bar{g}_i,est}^2}. \qquad (19)$$

Here, the sum runs over the 21 values of $\nu$ where we have measured the genus (as shown in
Fig. 9); $\bar{g}_i$ is the mean genus among the six slices at each $\nu$-value; $\tilde{g}_i$ is the value of the fitted
solid curve, and $\sigma_{\bar{g}_i,est}$ is the formal standard error in the mean as estimated from the six slices.
For comparison, we ran 10^ simulations of 20 t-variates with n = 6 (one less than 21, due to the 
one-parameter fit). We then computed $\tilde{\chi}^2$ for these datasets, and found that our observed value
of 36.4 fell at the 70% confidence level (i.e., one would expect the value of $\tilde{\chi}^2$ to be less than ours
70% of the time, more 30% of the time). Thus, the results were nearly within 1$\sigma$ statistical
agreement with the fitted theoretical curve, $g(\nu) \propto \nu e^{-\nu^2/2}$.
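
The Monte Carlo comparison can be sketched with the standard library alone. The code assumes that each simulated point is a t-variate built from six unit Gaussian draws, using the $(\bar{x} - \mu)/(s/\sqrt{n - 1})$ convention with $s$ normalized by $n$, and that the statistic is the sum of 20 squared t-variates. The trial count and seed are arbitrary choices, not those of the original calculation.

```python
import math
import random

def t_variate(rng, n=6):
    """One t-variate from n unit Gaussian draws with true mean zero."""
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)   # dispersion normalized by n
    return mean / (s / math.sqrt(n - 1))

def fraction_below(observed, n_points=20, n_slices=6, trials=20000, seed=1):
    """Fraction of simulated 'not-quite chi^2' values that fall below the observed one."""
    rng = random.Random(seed)
    below = 0
    for _ in range(trials):
        stat = sum(t_variate(rng, n_slices) ** 2 for _ in range(n_points))
        if stat < observed:
            below += 1
    return below / trials

# Percentile of the observed value 36.4 among the simulated statistics.
percentile = fraction_below(36.4, trials=5000)
```

Because a squared t-variate with five degrees of freedom has mean 5/3, the simulated statistic averages about 33 rather than the 20 expected for a true $\chi^2$ with 20 degrees of freedom, which is why the observed 36.4 is unremarkable.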

Although the above test accounts for the non-Gaussian error distribution on individual points, 
it does not account for the covariance of errors from one value of v to another. As one can see 
from Fig. 9, the difference between the solid curve and the data points is systematic and coherent:
the best-fit curve underestimates $|g(\nu)|$ at $\nu > 0$ but overestimates $|g(\nu)|$ at $\nu < 0$. In order to
assess the significance of this systematic departure from the Gaussian-field prediction, we turn 
to a more elaborate statistical technique developed by Colley (1997). Alternative approaches to 
dealing with correlated errors in genus curves are described by Vogeley et al. (1994), Protogeros & 
Weinberg (1997), and Springel et al. (1998). 

Since the genus values at nearby $\nu$ values might be correlated (they are measuring essentially
the same structures), we should look for a point-to-point correlation among the measured genus
values. If the correlation is significant, there will be significant off-diagonal terms in the covariance
matrix formed from the genus among the various $\nu$ values. In this case, the commonly used $\chi^2$
treatment, which assumes a diagonal covariance matrix, must be replaced by a treatment which
explicitly employs the full covariance matrix in the $\chi^2$ computation. This more formal $\chi^2$ statistic
should be an apt figure of merit for the quality of fit of the theoretical genus curve to the genus
values derived from a two-dimensional density field.

Since the point-to-point covariance is not known a priori, we are left to produce independent 
model fields to estimate the covariance. The model fields are pure Gaussian random phase fields, 
where each Fourier mode has a random phase and amplitude derived from a Gaussian distribution 
with a mean of zero, and a variance equal to the value of the power spectrum $P(|k|)$ at the
$k$ position of the mode in Fourier space. We used the two-dimensional power spectrum for an
$\Omega = 0.4$, $\Omega_\Lambda = 0.6$, $h = 0.6$ cosmology (i.e., the same power spectrum as in the simulation) in
generating 100 sets of 6 Gaussian random phase slices with identical physical dimensions to the
SSDSS slices. We applied exactly the same smoothing and genus topology routines to each model
field as we did to the SSDSS slices, giving us a total of 101 independent model genus datasets, 100
of which are guaranteed to derive from Gaussian, random phase fields. For each model dataset,
$m$, we computed the best fit theoretical curve, and derived its pairwise covariance between the
various values of $\nu_i$ and $\nu_j$. Model $m$ has covariance

$$C_{ij,m} = (g_{i,m} - \tilde{g}_{i,m})(g_{j,m} - \tilde{g}_{j,m}). \qquad (20)$$

Treating all datasets equivalently, we leave out each one in turn, and compute from the remaining
100 a model covariance matrix as the average of $C_{ij,m'}$ over the remaining datasets, i.e.,
$(C_m)_{ij} = \sum_{m' \neq m} C_{ij,m'}/100$. This allows a direct computation of $\chi^2_m$ in dataset $m$, based on a
completely independent covariance matrix,

$$\chi^2_m = \sum_{i,j} (g_{i,m} - \tilde{g}_{i,m}) (C_m)^{-1}_{ij} (g_{j,m} - \tilde{g}_{j,m}). \qquad (21)$$

With this improved figure of merit for the fit, we can compare the fit derived from the observed
data with that derived from the 100 model datasets. Without further assumption we may evaluate
directly the rank of $\chi^2$ from the real dataset among the 101 $\chi^2_m$ values. We obtain for the SSDSS
dataset a rank of 97 out of 101. The probability of obtaining a result that dramatic by chance is 5
out of 101. The simulated slices are therefore just marginally inconsistent with the pure Gaussian
random phase fields at the 95% confidence level. Colley (1997), using identical techniques, found
that the LCRS agreed better with the simple Gaussian random phase genus curve, ranking 67th
out of 101 in its $\chi^2$ when compared to pure Gaussian random phase curves. The small but
significant departure of the SSDSS from the Gaussian random phase curve is presumably due to
non-linear effects of structure growth, since the SSDSS was seeded with Gaussian random phase
initial conditions. In the following, we discuss non-linear effects on the genus and assess the
statistical distinguishability of the LCRS genus and SSDSS genus.
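
The leave-one-out covariance procedure and the final ranking can be sketched in plain Python. This is an illustration under stated assumptions, not the code of Colley (1997): the residuals (measured genus minus fitted curve) are taken as given, the covariance for dataset $m$ is the average outer product of the residuals of all other datasets, and a small Gaussian-elimination solver stands in for a matrix inverse. All names and the toy residuals are hypothetical.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def leave_one_out_chi2(residuals):
    """residuals[m][i] = g_{i,m} minus its fitted value; returns chi^2_m for each dataset m."""
    n_sets, n_pts = len(residuals), len(residuals[0])
    total = [[sum(res[i] * res[j] for res in residuals) for j in range(n_pts)]
             for i in range(n_pts)]
    chi2 = []
    for res in residuals:
        # Covariance averaged over every dataset except the one being tested.
        cov = [[(total[i][j] - res[i] * res[j]) / (n_sets - 1) for j in range(n_pts)]
               for i in range(n_pts)]
        weighted = solve(cov, res)          # C^{-1} applied to the residual vector
        chi2.append(sum(r * w for r, w in zip(res, weighted)))
    return chi2

def rank_of(index, values):
    """Rank (1 = smallest) of values[index] among all values."""
    return 1 + sum(v < values[index] for v in values)

# Toy residuals for five datasets at two thresholds; the last one is the outlier.
toy_residuals = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0], [2.0, 2.0]]
toy_chi2 = leave_one_out_chi2(toy_residuals)
toy_rank = rank_of(4, toy_chi2)
```

With 101 datasets, a rank of 97 means that 101 - 97 + 1 = 5 datasets have a $\chi^2$ at least as large as the observed one, which is the "5 out of 101" probability quoted above.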

We reconsider Matsubara's (1994) equation for the 3-D genus curve after weakly non-linear 
evolution of structure, equation (10). In dropping from three to two dimensions, we would expect 
the Hermite polynomial indices to drop by one, although the coefficients would not necessarily 
change in a trivial way. The treatment above fits for the coefficient of $H_1(\nu) = \nu$, the only term 
expected in a purely Gaussian random phase field (eq. ??). This fit is shown by the solid curve 
in Fig. 9. If we now allow fitting for even terms ($H_0 = 1$, $H_2 = \nu^2 - 1$, $H_4 = \nu^4 - 6\nu^2 + 3$), we find 
a much better fit, shown by the dashed curve in Fig. 9. Adding these terms reduces the number 
of voids by 8% ($\nu = -1$) and increases the number of clusters by 16% ($\nu = 1$) relative to the best 
fit with equal numbers of clusters and voids. As in the three-dimensional case, the new terms 
drastically reduce the $\chi^2$ of the fit, from 36.4 to 13.1 (even a bit below the expected level of 17), 
a much larger reduction than the 3 expected when adding three new independent adjustable 
parameters. We also note that the apparent effects of non-linearity have the same sign in both 
the three-dimensional and two-dimensional cases, in that there are slightly more clusters than 
voids in both cases. The coefficients of the even terms are roughly one order of magnitude smaller 
than the $H_1$ coefficient, which again suggests that non-linearity works in the direction 
predicted by Matsubara (1994), but not as strongly. 
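
The fit with even Hermite terms can be sketched as a linear least-squares problem, since the model quoted in the caption of Fig. 9 is linear in the products $A$, $AB$, $AC$ and $AD$. The sketch below is illustrative only: it assumes independent errors on each $g(\nu)$ point (the paper's own fit may treat the errors differently), and the function names `hermite_design` and `fit_genus_2d` are hypothetical.

```python
import numpy as np

def hermite_design(nu):
    """Columns H1, H0, H2, H4 (probabilists' Hermite), each times exp(-nu^2/2)."""
    nu = np.asarray(nu, dtype=float)
    weight = np.exp(-nu**2 / 2.0)
    h1 = nu
    h0 = np.ones_like(nu)
    h2 = nu**2 - 1.0
    h4 = nu**4 - 6.0 * nu**2 + 3.0
    return np.column_stack([h1, h0, h2, h4]) * weight[:, None]

def fit_genus_2d(nu, g, g_err, even_terms=True):
    """Weighted linear least squares for the 2-D genus curve.

    Returns the linear coefficients [A, A*B, A*C, A*D] (only [A] when
    even_terms is False) and the chi^2 of the fit. Assumes independent
    errors g_err on each point, which is an assumption of this sketch.
    """
    design = hermite_design(nu)
    if not even_terms:
        design = design[:, :1]  # keep only the Gaussian random phase term H1
    g = np.asarray(g, dtype=float)
    g_err = np.asarray(g_err, dtype=float)
    lhs = design / g_err[:, None]
    rhs = g / g_err
    coeffs, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    chi2 = float(np.sum((lhs @ coeffs - rhs) ** 2))
    return coeffs, chi2
```

Comparing the $\chi^2$ returned with `even_terms=False` and `even_terms=True` gives the kind of reduction discussed above; a drop much larger than 3 indicates that the three extra parameters are doing real work.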

For comparison with real observations, we have included a plot identical to Fig. 9 for the 
LCRS (Fig. 10). Upon inspection, one sees that, as in the SSDSS, the number of clusters in the 
LCRS is larger than the number of voids, but the effect is not as large as it is in the SSDSS. 
Using identical statistical techniques to those above, Colley (1997) showed that the LCRS genus 
curve was consistent with a Gaussian random phase genus curve within the 1-$\sigma$ confidence limit, 
while here we have shown that the SSDSS was marginally inconsistent at the 2-$\sigma$ (95%) level. 
In fitting for the non-linear terms in the LCRS, we have found that the coefficients of $H_0$, $H_2$ 
and $H_4$ imply a 6% decrease in voids at $\nu = -1$, and a 7% increase in clusters at $\nu = 1$, relative 
to the best fit curve with equal numbers. When compared to the 8% decrease in voids and 16% 
increase in clusters in the SSDSS, we see that the LCRS genus curve is less distinguishable from 
random phase than is the SSDSS genus curve. Furthermore, adding the three new fit parameters 
to the LCRS genus curve decreased $\chi^2$ by only 4, (roughly) as expected when adding three new 
independent parameters. In the case of the SSDSS, adding the three new parameters decreased $\chi^2$ 
by 23, indicating that the new parameters were important to the fit. 

A more direct comparison of the LCRS and SSDSS reveals that the genus curves derived 
from these surveys are consistent with each other. Using the $\chi^2$ method described in the previous 
section, we have found that the LCRS and SSDSS are consistent well within the 1-$\sigma$ consistency 
criterion. We now use a covariance matrix method similar to that discussed in the previous 
section (equations 20 and 22); however, instead of using the sample mean minus 
the best-fit value as the deviate at the $i$th $\nu$ value, we use $(g_{S,i,m} - g_{L,i,\ell})$, the mean genus value 
in SSDSS dataset $m$ minus the mean genus value in LCRS dataset $\ell$, where each dataset includes 
six slices. As before, we generated 100 fake datasets with six pure Gaussian random phase slices 
with identical geometry, power spectrum and smoothing to those of the "real" datasets, for both 
the SSDSS and LCRS. This means we have 101 independent datasets for both the SSDSS and 
LCRS. We then use equations (20) and (21), with the above substitutions, to compute $\chi^2_{\ell m}$ for the 
10201 possible combinations of datasets: 

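$$(C_{\ell m})_{ij} = (10200)^{-1} \sum_{\ell',m' \neq \ell,m} (g_{S,i,m'} - g_{L,i,\ell'})(g_{S,j,m'} - g_{L,j,\ell'}), \qquad (22)$$

$$\chi^2_{\ell m} = \sum_{i,j=1}^{21} (g_{S,i,m} - g_{L,i,\ell})\,(C_{\ell m})^{-1}_{ij}\,(g_{S,j,m} - g_{L,j,\ell}). \qquad (23)$$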

We find that $\chi^2$ for the $\ell = m = 101$ dataset, where the real datasets of both the LCRS and SSDSS 
are compared, falls at the 46th percentile of all 10201 $\chi^2$ values, an excellent agreement. This 
consistency between the LCRS and SSDSS is quite remarkable, as is recognizable when comparing 
Figs. 9 and 10 directly. In fact, the statistical similarity extends to the best-fit amplitudes 
of the genus curves: $A_{\rm LCRS} = 53$ and $A_{\rm SSDSS} = 56$ (see equation 13). The expected error in 
each amplitude (known from the simulated sets) is of order 3. The consistency between these two 
datasets indicates that the model cosmology within the simulation produces a remarkably similar 
genus curve to that derived from the observations. 
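
The pairwise comparison of equations (22) and (23) can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the code used for the paper: the array layout (one row of mean genus values per dataset, with the real dataset stored last) and the function name `pairwise_chi2` are hypothetical, and the covariance for each pair is taken, as in equation (22), from all other pairs of datasets.

```python
import numpy as np

def pairwise_chi2(g_ssdss, g_lcrs):
    """chi^2 for every (LCRS dataset l, SSDSS dataset m) pair.

    g_ssdss, g_lcrs: arrays of shape (n_datasets, n_nu) holding the mean
    genus at each threshold for each dataset (hypothetical layout).
    Returns an array chi2[l, m].
    """
    g_ssdss = np.asarray(g_ssdss, dtype=float)
    g_lcrs = np.asarray(g_lcrs, dtype=float)
    n_s, n_l = g_ssdss.shape[0], g_lcrs.shape[0]
    # diff[l, m, i] = g_S,i,m - g_L,i,l
    diff = g_ssdss[None, :, :] - g_lcrs[:, None, :]
    # Sum of outer products over all pairs; one pair is removed per chi^2 below.
    total = np.einsum("lmi,lmj->ij", diff, diff)
    n_pairs = n_l * n_s
    chi2 = np.empty((n_l, n_s))
    for l in range(n_l):
        for m in range(n_s):
            d = diff[l, m]
            # Covariance from every pair except (l, m) itself.
            cov = (total - np.outer(d, d)) / (n_pairs - 1)
            chi2[l, m] = d @ np.linalg.solve(cov, d)
    return chi2
```

With 101 datasets per survey this yields the 10201 values of $\chi^2_{\ell m}$; if the real datasets are stored last, the percentile of the real comparison is `np.mean(chi2 <= chi2[-1, -1])`.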

7. Availability of the Mock Catalog 



We anticipate making further use of this mock redshift catalog in our own preparations 
for analysis of the SDSS. In the hope that it may be useful to other researchers both 
inside and outside the SDSS collaboration, we are making the mock catalog available at 
http://www.astronomy.ohio-state.edu/~dhw/ssdss.html (in the event of questions or 
difficulties, contact David Weinberg). This catalog complements the set of publicly available 2dF 
and SDSS mock catalogs described by Cole et al. (1998). The present catalog is drawn from a 
larger volume simulation [$(600\,h^{-1}{\rm Mpc})^3$ vs. $(345.6\,h^{-1}{\rm Mpc})^3$] and uses more careful modeling 
of the SDSS selection criteria. The Cole et al. simulations, on the other hand, have higher 
gravitational force resolution ($90\,h^{-1}$ kpc vs. $1\,h^{-1}$ Mpc) and cover many cosmological models (20 
different sets of cosmological parameters and several biasing schemes). Thus, the present mock 
catalog is probably more suitable for studies that probe large scales or require careful matching of 
the anticipated SDSS selection function, while the Cole et al. mock catalogs are more useful for 
studies aimed at testing the ability of statistical diagnostics to distinguish between cosmological 
models with surveys the size of 2dF or the SDSS. For those wishing to create mock SDSS catalogs 
from their own large N-body simulations, the code used to assign galaxy properties and apply the 
SDSS selection criteria is available on request from David Weinberg. It is not trivial to use, but it 
is extensively commented. 



8. Conclusions 

We have constructed a mock survey to mimic the SDSS redshift survey of a million galaxies in 
the North Galactic Cap. We have used a large $N$ = 54,872,000 body simulation with $\Omega_{\rm CDM} = 0.4$, 
$\Omega_\Lambda = 0.6$, $h = 0.6$, $b = 1.3$, which has the Gaussian random phase initial conditions expected from 
an inflationary model, where perturbations arise from quantum fluctuations in the early universe. 
We measure the 3-D genus curve in this simulation and find that it indeed exhibits the shape 
predicted theoretically for Gaussian initial conditions. The SDSS will be able to measure the genus 
curve with unprecedented precision. A sponge-like topology with $\sim 500$ holes is measured here, 
giving a pointwise statistical precision of 4%, a vast improvement over previous surveys. Indeed, 
the data trace the random phase curve so well that observers confronted with such data would 
almost certainly conclude (correctly) that the mock universe had started with Gaussian random 
phase initial conditions. Small deviations from the random phase curve are, however, detectable at 
high statistical significance, the main effect being a slight excess of clusters over voids that results 
from non-linear gravitational evolution and biasing (Park & Gott 1991b). Although the influence 
of non-Gaussianity of primordial fluctuations on the genus curve depends on the specifics of the 
theoretical model, the numerical studies of the topology of the texture model by Gooding et al. 
(1992) and of "generic" non-Gaussian models by Weinberg & Cole (1992) suggest that typical 
non-Gaussian models produce distortions of the genus curve that could be detected easily at the 
high level of precision obtainable with the SDSS. 

In comparing to predicted effects of non-linear structure growth on the genus (Matsubara 
1994), we find the measured effects to be consistent in sign but not in magnitude. The Matsubara 
(1994) formula predicts a much larger (roughly two orders of magnitude) distortion of the genus 
curve than we observe. This discrepancy may result from using the formula outside its applicable 
range. 

We have also shown how the genus curve of the velocity field may be measured in the SDSS. 
Again, the results are consistent with the Gaussian random phase prediction, but the statistical 
precision is very low because a very large smoothing length is required to beat down the noise in 
redshift-independent distance measurements. 

Since the earliest products from the Sloan redshift survey will be 2-D slices, we have measured 
the topology (genus) of large-scale structure in such slices drawn from the mock survey. Again, we 
find that the number of clusters slightly exceeds the number of voids, but at a lower level than 
would presumably be expected from Matsubara (1994). 

Finally, we compare our simulation slices directly with a very similar study of the (observed) 
Las Campanas Redshift Survey slices (Colley 1997). While both the LCRS and our simulation 
slices show a slight excess of clusters over voids, the effect is about twice as large in the simulation. 
The genus of the LCRS slices nonetheless agrees with the genus of the simulation slices, well 
within the 1-$\sigma$ interval, without any adjustment of amplitude in the curves. 

A better understanding of the non-linear structure growth and its effects on the genus curve 
will come with ever more elaborate cosmological simulations, and with the Sloan Digital Sky 
Survey itself. A survey this large is a powerful tool for evaluating theories of cosmic structure 
formation. 

Table 1 

        predicted    best-fit ($\nu$)    best-fit ($\nu_\sigma$) 
  S     3.468        -0.136              0.0501 
  T     2.312        -0.0314             -0.116 
  U     1.227        0.129               0.0512 

Table 1: Best-fit values of $S$, $T$ and $U$ vs. values predicted by Matsubara (1994). In describing 
the genus of the large-scale distribution of galaxies, $S$, $T$ and $U$, defined in equation (10), are 
coefficients of odd polynomials that add to the usual second-order even polynomial expected in 
the Gaussian random phase genus curve in three dimensions. The second column is for density 
thresholds described by $\nu$ (equation ??); the third column is for density thresholds defined strictly 
by standard deviations, $\nu_\sigma = \delta/\sigma$. Note that the range of the density thresholds for the fits has been 
limited to the range suggested by Matsubara & Suto (1996), who give $-0.2 < \nu\sigma < 0.4$, where $\sigma$ is 
the standard deviation in density fluctuation at this smoothing length ($R_s = 10\,h^{-1}$ Mpc, $\sigma = 0.408$). 



Fig. 1. — A projection onto the sky of the nearly 1 million galaxies in the Simulated Sloan Digital 
Sky Survey. The tickmarks indicate the boundaries of the six degree slice in Fig. 2. 



Fig. 2. — A six degree slice of the Simulated Sloan Digital Sky Survey 



Fig. 3. — (a) Median density contour in the Simulated Sloan Digital Sky Survey, showing its 
sponge-like form. The observer is at the apex (left), the radius of the sample is $500\,h^{-1}$ Mpc, and the 
smoothing length (cf. equation ??) is $R_s = 10\,h^{-1}$ Mpc. 



Fig. 3. — (b) 93rd density percentile contour in the Simulated Sloan Digital Sky Survey, showing 
isolated clusters, for the same sample and smoothing length shown in Figure 3(a). 

Fig. 4. — (a) The genus curve of three-dimensional structure in the SSDSS, in real space (filled 
points and solid curve) and redshift space (open points and dashed curve). The smoothing length 
is $R_s = 5\,h^{-1}$ Mpc. The curves are the best fit for $g = A(1 - \nu^2)\exp(-\nu^2/2)$, expected for a 
Gaussian random phase field. Error bars are 1-$\sigma$ confidence intervals. 

Fig. 4. — (b) As in 4(a), but for a smoothing length of $R_s = 10\,h^{-1}$ Mpc. 

Fig. 5. — Comparison of the 3-D genus with predictions from Matsubara (1994) for weakly 
non-linear evolution of the genus. The heavy solid curve is the best fit for the purely Gaussian random 
phase curve. The solid points reflect density thresholds $\nu$ defined in equation ??; the open points 
reflect density thresholds that are strict standard deviations, $\nu_\sigma = \delta/\sigma$. The long- and 
short-dashed curves include the best fits for amplitude and for $S$, $T$ and $U$, given in equation (10), in 
terms of $\nu$ and $\nu_\sigma$ respectively. These are coefficients of odd terms added to the usual even genus 
curve. The dotted curve is the prediction of Matsubara (1994). Note that the range of the density 
thresholds is delimited by Matsubara & Suto (1996), who suggest $-0.2 < \nu\sigma < 0.4$, where $\sigma$ is the 
standard deviation in density fluctuation at this smoothing length ($R_s = 10\,h^{-1}$ Mpc; $\sigma = 0.408$). 

Fig. 6. — At top is the true radial peculiar velocity field in a slice of the simulated SDSS. At bottom 
is the "observed" radial peculiar velocity field. Both fields have been smoothed with a Gaussian 
filter of radius $R_s = 21\,h^{-1}$ Mpc. The shaded regions have a positive radial peculiar velocity and 
the shaded contours represent velocities of +50 km/s and +100 km/s. The unshaded regions have 
a negative radial peculiar velocity and the unshaded contours represent velocities of -50 km/s and 
-100 km/s. 

Fig. 7. — The genus curve of the "observed" radial peculiar velocity field (left) and the true radial 
peculiar velocity field (right) of the SSDSS. Points show simulation measurements with 1-$\sigma$ error 
bars computed from the variance among four subsets of the survey, and smooth curves show the 
form expected for a Gaussian field (eq. ??). 

Fig. 8. — Contours of $\nu = \{-2, -1, 0, 1, 2\}$ in the $\eta = -30^\circ$ slice of the SSDSS, after smoothing 
with a Gaussian filter with $R_s = 20\,h^{-1}$ Mpc. The median density contour ($\nu = 0$) is 
dashed. High density contours are heavy and solid ($\nu = \{1, 2\}$); low density contours are light and 
solid ($\nu = \{-2, -1\}$). Over-plotted are the galaxy locations themselves, so that the location of 
overdense regions and voids is obvious. 

Fig. 9. — The genus curve of two-dimensional structure in SSDSS slices. The smoothing length is 
$R_s = 20\,h^{-1}$ Mpc. The solid curve is the best fit for $g = A\,\nu\exp(-\nu^2/2)$, expected for a Gaussian 
random phase field. The dashed curve is the best fit for $g = A[H_1(\nu) + BH_0(\nu) + CH_2(\nu) + 
DH_4(\nu)]\exp(-\nu^2/2)$, where $H_n$ is the Hermite polynomial of degree $n$ ($H_1(\nu) = \nu$). Points 
are the average results from six 1.5° slices; solid and dotted error bars show the 68.3% and 95% 
confidence intervals on each $g(\nu)$ measurement computed using the Student's t-distribution and the 
variance among the six slices. The y-axis is also labeled (on the right) by the total genus in the 
sample. 

Fig. 10. — As in Fig. 9, but for the Las Campanas Redshift Survey. Notice that the fit is not much 
improved after adding even terms, as expected when non-linear structure growth is important. In 
the Simulated SDSS, however, the fit improved substantially. 